The patch titled
Subject: mm/page_alloc: fix memmap_init_zone pageblock alignment
has been added to the -mm tree. Its filename is
mm-page_alloc-fix-memmap_init_zone-pageblock-alignment.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-fix-memmap_init_zone…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-fix-memmap_init_zone…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Daniel Vacek <neelx(a)redhat.com>
Subject: mm/page_alloc: fix memmap_init_zone pageblock alignment
b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where
possible") introduced a bug where move_freepages() triggers a VM_BUG_ON()
on uninitialized page structure due to pageblock alignment. To fix this,
simply align the skipped pfns in memmap_init_zone() the same way as in
move_freepages_block().
Seen in one of the RHEL reports:
crash> log | grep -e BUG -e RIP -e Call.Trace -e move_freepages_block -e rmqueue -e freelist -A1
kernel BUG at mm/page_alloc.c:1389!
invalid opcode: 0000 [#1] SMP
--
RIP: 0010:[<ffffffff8118833e>] [<ffffffff8118833e>] move_freepages+0x15e/0x160
RSP: 0018:ffff88054d727688 EFLAGS: 00010087
--
Call Trace:
[<ffffffff811883b3>] move_freepages_block+0x73/0x80
[<ffffffff81189e63>] __rmqueue+0x263/0x460
[<ffffffff8118c781>] get_page_from_freelist+0x7e1/0x9e0
[<ffffffff8118caf6>] __alloc_pages_nodemask+0x176/0x420
--
RIP [<ffffffff8118833e>] move_freepages+0x15e/0x160
RSP <ffff88054d727688>
crash> page_init_bug -v | grep RAM
<struct resource 0xffff88067fffd2f8> 1000 - 9bfff System RAM (620.00 KiB)
<struct resource 0xffff88067fffd3a0> 100000 - 430bffff System RAM ( 1.05 GiB = 1071.75 MiB = 1097472.00 KiB)
<struct resource 0xffff88067fffd410> 4b0c8000 - 4bf9cfff System RAM ( 14.83 MiB = 15188.00 KiB)
<struct resource 0xffff88067fffd480> 4bfac000 - 646b1fff System RAM (391.02 MiB = 400408.00 KiB)
<struct resource 0xffff88067fffd560> 7b788000 - 7b7fffff System RAM (480.00 KiB)
<struct resource 0xffff88067fffd640> 100000000 - 67fffffff System RAM ( 22.00 GiB)
crash> page_init_bug | head -6
<struct resource 0xffff88067fffd560> 7b788000 - 7b7fffff System RAM (480.00 KiB)
<struct page 0xffffea0001ede200> 1fffff00000000 0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32 4096 1048575
<struct page 0xffffea0001ede200> 505736 505344 <struct page 0xffffea0001ed8000> 505855 <struct page 0xffffea0001edffc0>
<struct page 0xffffea0001ed8000> 0 0 <struct pglist_data 0xffff88047ffd9000> 0 <struct zone 0xffff88047ffd9000> DMA 1 4095
<struct page 0xffffea0001edffc0> 1fffff00000400 0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32 4096 1048575
BUG, zones differ!
Note that this range follows two not populated sections 68000000-77ffffff
in this zone. 7b788000-7b7fffff is the first one after a gap. This makes
memmap_init_zone() skip all the pfns up to the beginning of this range.
But this range is not pageblock (2M) aligned. In fact no range has to be.
crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b787000 7b788000
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffffea0001e00000 78000000 0 0 0 0
ffffea0001ed7fc0 7b5ff000 0 0 0 0
ffffea0001ed8000 7b600000 0 0 0 0 <<<<
ffffea0001ede1c0 7b787000 0 0 0 0
ffffea0001ede200 7b788000 0 0 1 1fffff00000000
Top part of page flags should contain nodeid and zonenr, which is not
the case for page ffffea0001ed8000 here (<<<<).
crash> log | grep -o fffea0001ed[^\ ]* | sort -u
fffea0001ed8000
fffea0001eded20
fffea0001edffc0
crash> bt -r | grep -o fffea0001ed[^\ ]* | sort -u
fffea0001ed8000
fffea0001eded00
fffea0001eded20
fffea0001edffc0
Initialization of the whole beginning of the section is skipped up to the
start of the range due to the commit b92df1de5d28. Now any code calling
move_freepages_block() (like reusing the page from a freelist as in this
example) with a page from the beginning of the range will get the page
rounded down to start_page ffffea0001ed8000 and passed to move_freepages()
which crashes on assertion getting wrong zonenr.
> VM_BUG_ON(page_zone(start_page) != page_zone(end_page));
Note, page_zone() derives the zone from page flags here.
>From similar machine before commit b92df1de5d28:
crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
fffff73941e00000 78000000 0 0 1 1fffff00000000
fffff73941ed7fc0 7b5ff000 0 0 1 1fffff00000000
fffff73941ed8000 7b600000 0 0 1 1fffff00000000
fffff73941edff80 7b7fe000 0 0 1 1fffff00000000
fffff73941edffc0 7b7ff000 ffff8e67e04d3ae0 ad84 1 1fffff00020068 uptodate,lru,active,mappedtodisk
All the pages since the beginning of the section are initialized.
move_freepages()' not gonna blow up.
The same machine with this fix applied:
crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffffea0001e00000 78000000 0 0 0 0
ffffea0001e00000 7b5ff000 0 0 0 0
ffffea0001ed8000 7b600000 0 0 1 1fffff00000000
ffffea0001edff80 7b7fe000 0 0 1 1fffff00000000
ffffea0001edffc0 7b7ff000 ffff88017fb13720 8 2 1fffff00020068 uptodate,lru,active,mappedtodisk
At least the bare minimum of pages is initialized preventing the crash as well.
Link: http://lkml.kernel.org/r/0485727b2e82da7efbce5f6ba42524b429d0391a.152001194…
Fixes: b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible")
Signed-off-by: Daniel Vacek <neelx(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Paul Burton <paul.burton(a)imgtec.com>
Cc: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff -puN mm/page_alloc.c~mm-page_alloc-fix-memmap_init_zone-pageblock-alignment mm/page_alloc.c
--- a/mm/page_alloc.c~mm-page_alloc-fix-memmap_init_zone-pageblock-alignment
+++ a/mm/page_alloc.c
@@ -5359,9 +5359,14 @@ void __meminit memmap_init_zone(unsigned
/*
* Skip to the pfn preceding the next valid one (or
* end_pfn), such that we hit a valid pfn (or end_pfn)
- * on our next iteration of the loop.
+ * on our next iteration of the loop. Note that it needs
+ * to be pageblock aligned even when the region itself
+ * is not. move_freepages_block() can shift ahead of
+ * the valid region but still depends on correct page
+ * metadata.
*/
- pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
+ pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
+ ~(pageblock_nr_pages-1)) - 1;
#endif
continue;
}
_
Patches currently in -mm which might be from neelx(a)redhat.com are
mm-memblock-hardcode-the-end_pfn-being-1.patch
mm-page_alloc-fix-memmap_init_zone-pageblock-alignment.patch
Hi All,
Resent without non-upstream patches.
This backport patchset fixed the spectre issue, it's original branch:
https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=kpti
A few dependency or fixingpatches are also picked up, if they are necessary
and no functional changes.
No bug found from kernelci.org and lkft testing. It also could be gotten from:
git://git.linaro.org/kernel/linux-linaro-stable.git v4.9-spectre-upstream-only
Comments are appreciated!
Regards
Alex
[PATCH 01/45] mm: Introduce lm_alias
[PATCH 02/45] arm64: alternatives: apply boot time fixups via the
[PATCH 03/45] arm64: barrier: Add CSDB macros to control data-value
[PATCH 04/45] arm64: Implement array_index_mask_nospec()
[PATCH 05/45] arm64: move TASK_* definitions to <asm/processor.h>
[PATCH 06/45] arm64: Factor out PAN enabling/disabling into separate
[PATCH 07/45] arm64: Factor out TTBR0_EL1 post-update workaround into
[PATCH 08/45] arm64: uaccess: consistently check object sizes
[PATCH 09/45] arm64: Make USER_DS an inclusive limit
[PATCH 10/45] arm64: Use pointer masking to limit uaccess speculation
[PATCH 11/45] arm64: syscallno is secretly an int, make it official
[PATCH 12/45] arm64: entry: Ensure branch through syscall table is
[PATCH 13/45] arm64: uaccess: Prevent speculative use of the current
[PATCH 14/45] arm64: uaccess: Don't bother eliding access_ok checks
[PATCH 15/45] arm64: uaccess: Mask __user pointers for __arch_{clear,
[PATCH 16/45] arm64: futex: Mask __user pointers prior to dereference
[PATCH 17/45] drivers/firmware: Expose psci_get_version through
[PATCH 18/45] arm64: cpufeature: __this_cpu_has_cap() shouldn't stop
[PATCH 19/45] arm64: cpu_errata: Allow an erratum to be match for all
[PATCH 20/45] arm64: Run enable method for errata work arounds on
[PATCH 21/45] arm64: cpufeature: Pass capability structure to
[PATCH 22/45] arm64: Move post_ttbr_update_workaround to C code
[PATCH 23/45] arm64: Add skeleton to harden the branch predictor
[PATCH 24/45] arm64: Move BP hardening to check_and_switch_context
[PATCH 25/45] arm64: KVM: Use per-CPU vector when BP hardening is
[PATCH 26/45] arm64: entry: Apply BP hardening for high-priority
[PATCH 27/45] arm64: entry: Apply BP hardening for suspicious
[PATCH 28/45] arm64: cputype: Add missing MIDR values for Cortex-A72
[PATCH 29/45] arm64: Implement branch predictor hardening for
[PATCH 30/45] arm64: KVM: Increment PC after handling an SMC trap
[PATCH 31/45] arm/arm64: KVM: Consolidate the PSCI include files
[PATCH 32/45] arm/arm64: KVM: Add PSCI_VERSION helper
[PATCH 33/45] arm/arm64: KVM: Add smccc accessors to PSCI code
[PATCH 34/45] arm/arm64: KVM: Implement PSCI 1.0 support
[PATCH 35/45] arm/arm64: KVM: Advertise SMCCC v1.1
[PATCH 36/45] arm64: KVM: Make PSCI_VERSION a fast path
[PATCH 37/45] arm/arm64: KVM: Turn kvm_psci_version into a static
[PATCH 38/45] arm64: KVM: Report SMCCC_ARCH_WORKAROUND_1 BP hardening
[PATCH 39/45] arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling
[PATCH 40/45] firmware/psci: Expose PSCI conduit
[PATCH 41/45] firmware/psci: Expose SMCCC version through psci_ops
[PATCH 42/45] arm/arm64: smccc: Make function identifiers an unsigned
[PATCH 43/45] arm/arm64: smccc: Implement SMCCC v1.1 inline primitive
[PATCH 44/45] arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening
[PATCH 45/45] arm64: Kill PSCI_GET_VERSION as a variant-2 workaround
The patch titled
Subject: mm-memblock-hardcode-the-end_pfn-being-1-fix
has been added to the -mm tree. Its filename is
mm-memblock-hardcode-the-end_pfn-being-1-fix.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-memblock-hardcode-the-end_pfn-b…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-memblock-hardcode-the-end_pfn-b…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Andrew Morton <akpm(a)linux-foundation.org>
Subject: mm-memblock-hardcode-the-end_pfn-being-1-fix
make it work against current -linus, not against -mm
Cc: Daniel Vacek <neelx(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Paul Burton <paul.burton(a)imgtec.com>
Cc: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -puN mm/page_alloc.c~mm-memblock-hardcode-the-end_pfn-being-1-fix mm/page_alloc.c
--- a/mm/page_alloc.c~mm-memblock-hardcode-the-end_pfn-being-1-fix
+++ a/mm/page_alloc.c
@@ -5361,7 +5361,7 @@ void __meminit memmap_init_zone(unsigned
* end_pfn), such that we hit a valid pfn (or end_pfn)
* on our next iteration of the loop.
*/
- pfn = memblock_next_valid_pfn(pfn) - 1;
+ pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
#endif
continue;
}
_
Patches currently in -mm which might be from akpm(a)linux-foundation.org are
i-need-old-gcc.patch
mm-memblock-hardcode-the-end_pfn-being-1-fix.patch
arm-arch-arm-include-asm-pageh-needs-personalityh.patch
mm.patch
mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma-fix.patch
direct-io-minor-cleanups-in-do_blockdev_direct_io-fix.patch
list_lru-prefetch-neighboring-list-entries-before-acquiring-lock-fix.patch
mm-oom-cgroup-aware-oom-killer-fix.patch
mm-oom-docs-describe-the-cgroup-aware-oom-killer-fix-2-fix.patch
proc-add-seq_put_decimal_ull_width-to-speed-up-proc-pid-smaps-fix.patch
sysctl-add-flags-to-support-min-max-range-clamping-fix.patch
linux-next-git-rejects.patch
kernel-forkc-export-kernel_thread-to-modules.patch
slab-leaks3-default-y.patch
The patch titled
Subject: mm/memblock.c: hardcode the end_pfn being -1
has been added to the -mm tree. Its filename is
mm-memblock-hardcode-the-end_pfn-being-1.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-memblock-hardcode-the-end_pfn-b…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-memblock-hardcode-the-end_pfn-b…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Daniel Vacek <neelx(a)redhat.com>
Subject: mm/memblock.c: hardcode the end_pfn being -1
This is just a cleanup. It aids handling the special end case in the next
commit.
Link: http://lkml.kernel.org/r/1ca478d4269125a99bcfb1ca04d7b88ac1aee924.152001194…
Signed-off-by: Daniel Vacek <neelx(a)redhat.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Cc: Paul Burton <paul.burton(a)imgtec.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memblock.c | 13 ++++++-------
mm/page_alloc.c | 2 +-
2 files changed, 7 insertions(+), 8 deletions(-)
diff -puN mm/memblock.c~mm-memblock-hardcode-the-end_pfn-being-1 mm/memblock.c
--- a/mm/memblock.c~mm-memblock-hardcode-the-end_pfn-being-1
+++ a/mm/memblock.c
@@ -1101,13 +1101,12 @@ void __init_memblock __next_mem_pfn_rang
*out_nid = r->nid;
}
-unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
- unsigned long max_pfn)
+unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
{
struct memblock_type *type = &memblock.memory;
unsigned int right = type->cnt;
unsigned int mid, left = 0;
- phys_addr_t addr = PFN_PHYS(pfn + 1);
+ phys_addr_t addr = PFN_PHYS(++pfn);
do {
mid = (right + left) / 2;
@@ -1118,15 +1117,15 @@ unsigned long __init_memblock memblock_n
type->regions[mid].size))
left = mid + 1;
else {
- /* addr is within the region, so pfn + 1 is valid */
- return min(pfn + 1, max_pfn);
+ /* addr is within the region, so pfn is valid */
+ return pfn;
}
} while (left < right);
if (right == type->cnt)
- return max_pfn;
+ return -1UL;
else
- return min(PHYS_PFN(type->regions[right].base), max_pfn);
+ return PHYS_PFN(type->regions[right].base);
}
/**
diff -puN mm/page_alloc.c~mm-memblock-hardcode-the-end_pfn-being-1 mm/page_alloc.c
--- a/mm/page_alloc.c~mm-memblock-hardcode-the-end_pfn-being-1
+++ a/mm/page_alloc.c
@@ -5361,7 +5361,7 @@ void __meminit memmap_init_zone(unsigned
* end_pfn), such that we hit a valid pfn (or end_pfn)
* on our next iteration of the loop.
*/
- pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
+ pfn = memblock_next_valid_pfn(pfn) - 1;
#endif
continue;
}
_
Patches currently in -mm which might be from neelx(a)redhat.com are
mm-memblock-hardcode-the-end_pfn-being-1.patch
mm-page_alloc-fix-memmap_init_zone-pageblock-alignment.patch
The patch titled
Subject: mm/page_alloc: fix memmap_init_zone pageblock alignment
has been removed from the -mm tree. Its filename was
mm-page_alloc-fix-memmap_init_zone-pageblock-alignment.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Daniel Vacek <neelx(a)redhat.com>
Subject: mm/page_alloc: fix memmap_init_zone pageblock alignment
BUG at mm/page_alloc.c:1913
> VM_BUG_ON(page_zone(start_page) != page_zone(end_page));
Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") introduced a bug where move_freepages() triggers a
VM_BUG_ON() on uninitialized page structure due to pageblock alignment.
To fix this, simply align the skipped pfns in memmap_init_zone() the same
way as in move_freepages_block().
Link: http://lkml.kernel.org/r/1519988497-28941-1-git-send-email-neelx@redhat.com
Fixes: b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible")
Signed-off-by: Daniel Vacek <neelx(a)redhat.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Cc: Paul Burton <paul.burton(a)imgtec.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memblock.c | 13 ++++++-------
mm/page_alloc.c | 9 +++++++--
2 files changed, 13 insertions(+), 9 deletions(-)
diff -puN mm/memblock.c~mm-page_alloc-fix-memmap_init_zone-pageblock-alignment mm/memblock.c
--- a/mm/memblock.c~mm-page_alloc-fix-memmap_init_zone-pageblock-alignment
+++ a/mm/memblock.c
@@ -1101,13 +1101,12 @@ void __init_memblock __next_mem_pfn_rang
*out_nid = r->nid;
}
-unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
- unsigned long max_pfn)
+unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
{
struct memblock_type *type = &memblock.memory;
unsigned int right = type->cnt;
unsigned int mid, left = 0;
- phys_addr_t addr = PFN_PHYS(pfn + 1);
+ phys_addr_t addr = PFN_PHYS(++pfn);
do {
mid = (right + left) / 2;
@@ -1118,15 +1117,15 @@ unsigned long __init_memblock memblock_n
type->regions[mid].size))
left = mid + 1;
else {
- /* addr is within the region, so pfn + 1 is valid */
- return min(pfn + 1, max_pfn);
+ /* addr is within the region, so pfn is valid */
+ return pfn;
}
} while (left < right);
if (right == type->cnt)
- return max_pfn;
+ return -1UL;
else
- return min(PHYS_PFN(type->regions[right].base), max_pfn);
+ return PHYS_PFN(type->regions[right].base);
}
/**
diff -puN mm/page_alloc.c~mm-page_alloc-fix-memmap_init_zone-pageblock-alignment mm/page_alloc.c
--- a/mm/page_alloc.c~mm-page_alloc-fix-memmap_init_zone-pageblock-alignment
+++ a/mm/page_alloc.c
@@ -5359,9 +5359,14 @@ void __meminit memmap_init_zone(unsigned
/*
* Skip to the pfn preceding the next valid one (or
* end_pfn), such that we hit a valid pfn (or end_pfn)
- * on our next iteration of the loop.
+ * on our next iteration of the loop. Note that it needs
+ * to be pageblock aligned even when the region itself
+ * is not as move_freepages_block() can shift ahead of
+ * the valid region but still depends on correct page
+ * metadata.
*/
- pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
+ pfn = (memblock_next_valid_pfn(pfn) &
+ ~(pageblock_nr_pages-1)) - 1;
#endif
continue;
}
_
Patches currently in -mm which might be from neelx(a)redhat.com are
mm-memblock-hardcode-the-end_pfn-being-1.patch
The patch titled
Subject: mm/page_alloc: fix memmap_init_zone pageblock alignment
has been added to the -mm tree. Its filename is
mm-page_alloc-fix-memmap_init_zone-pageblock-alignment.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-fix-memmap_init_zone…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-fix-memmap_init_zone…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/SubmitChecklist when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Daniel Vacek <neelx(a)redhat.com>
Subject: mm/page_alloc: fix memmap_init_zone pageblock alignment
BUG at mm/page_alloc.c:1913
> VM_BUG_ON(page_zone(start_page) != page_zone(end_page));
Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") introduced a bug where move_freepages() triggers a
VM_BUG_ON() on uninitialized page structure due to pageblock alignment.
To fix this, simply align the skipped pfns in memmap_init_zone() the same
way as in move_freepages_block().
Link: http://lkml.kernel.org/r/1519988497-28941-1-git-send-email-neelx@redhat.com
Fixes: b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible")
Signed-off-by: Daniel Vacek <neelx(a)redhat.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Cc: Paul Burton <paul.burton(a)imgtec.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memblock.c | 13 ++++++-------
mm/page_alloc.c | 9 +++++++--
2 files changed, 13 insertions(+), 9 deletions(-)
diff -puN mm/memblock.c~mm-page_alloc-fix-memmap_init_zone-pageblock-alignment mm/memblock.c
--- a/mm/memblock.c~mm-page_alloc-fix-memmap_init_zone-pageblock-alignment
+++ a/mm/memblock.c
@@ -1101,13 +1101,12 @@ void __init_memblock __next_mem_pfn_rang
*out_nid = r->nid;
}
-unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
- unsigned long max_pfn)
+unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
{
struct memblock_type *type = &memblock.memory;
unsigned int right = type->cnt;
unsigned int mid, left = 0;
- phys_addr_t addr = PFN_PHYS(pfn + 1);
+ phys_addr_t addr = PFN_PHYS(++pfn);
do {
mid = (right + left) / 2;
@@ -1118,15 +1117,15 @@ unsigned long __init_memblock memblock_n
type->regions[mid].size))
left = mid + 1;
else {
- /* addr is within the region, so pfn + 1 is valid */
- return min(pfn + 1, max_pfn);
+ /* addr is within the region, so pfn is valid */
+ return pfn;
}
} while (left < right);
if (right == type->cnt)
- return max_pfn;
+ return -1UL;
else
- return min(PHYS_PFN(type->regions[right].base), max_pfn);
+ return PHYS_PFN(type->regions[right].base);
}
/**
diff -puN mm/page_alloc.c~mm-page_alloc-fix-memmap_init_zone-pageblock-alignment mm/page_alloc.c
--- a/mm/page_alloc.c~mm-page_alloc-fix-memmap_init_zone-pageblock-alignment
+++ a/mm/page_alloc.c
@@ -5359,9 +5359,14 @@ void __meminit memmap_init_zone(unsigned
/*
* Skip to the pfn preceding the next valid one (or
* end_pfn), such that we hit a valid pfn (or end_pfn)
- * on our next iteration of the loop.
+ * on our next iteration of the loop. Note that it needs
+ * to be pageblock aligned even when the region itself
+ * is not as move_freepages_block() can shift ahead of
+ * the valid region but still depends on correct page
+ * metadata.
*/
- pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
+ pfn = (memblock_next_valid_pfn(pfn) &
+ ~(pageblock_nr_pages-1)) - 1;
#endif
continue;
}
_
Patches currently in -mm which might be from neelx(a)redhat.com are
mm-page_alloc-fix-memmap_init_zone-pageblock-alignment.patch
This is the start of the stable review cycle for the 3.18.98 release.
There are 24 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun Mar 4 08:42:22 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.98-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 3.18.98-rc1
Yangbo Lu <yangbo.lu(a)nxp.com>
net: gianfar_ptp: move set_fipers() to spinlock protecting area
Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
sctp: make use of pre-calculated len
Ross Lagerwall <ross.lagerwall(a)citrix.com>
xen/gntdev: Fix partial gntdev_mmap() cleanup
Ross Lagerwall <ross.lagerwall(a)citrix.com>
xen/gntdev: Fix off-by-one error when unmapping with holes
Sergei Shtylyov <sergei.shtylyov(a)cogentembedded.com>
SolutionEngine771x: fix Ether platform data
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
mdio-sun4i: Fix a memory leak
Eduardo Otubo <otubo(a)redhat.com>
xen-netfront: enable device after manual module load
Xiongwei Song <sxwjean(a)gmail.com>
drm/ttm: check the return value of kzalloc
Tushar Dave <tushar.n.dave(a)oracle.com>
e1000: fix disabling already-disabled warning
Aliaksei Karaliou <akaraliou.dev(a)gmail.com>
xfs: quota: check result of register_shrinker()
Aliaksei Karaliou <akaraliou.dev(a)gmail.com>
xfs: quota: fix missed destroy of qi_tree_lock
Stefan Haberland <sth(a)linux.vnet.ibm.com>
s390/dasd: fix wrongly assigned configuration data
Matthieu CASTET <matthieu.castet(a)parrot.com>
led: core: Fix brightness setting when setting delay_off=0
Guilherme G. Piccoli <gpiccoli(a)linux.vnet.ibm.com>
bnx2x: Improve reliability in case of nested PCI errors
Siva Reddy Kallam <siva.kallam(a)broadcom.com>
tg3: Enable PHY reset in MTU change path for 5720
Siva Reddy Kallam <siva.kallam(a)broadcom.com>
tg3: Add workaround to restrict 5762 MRRS to 2048
Cathy Avery <cavery(a)redhat.com>
scsi: storvsc: Fix scsi_cmd error assignments in storvsc_handle_error
Alexander Kochetkov <al.kochet(a)gmail.com>
net: arc_emac: fix arc_emac_rx() error paths
Radu Pirea <radu.pirea(a)microchip.com>
spi: atmel: fixed spin_lock usage inside atmel_spi_remove
Al Viro <viro(a)zeniv.linux.org.uk>
sget(): handle failures of register_shrinker()
Brendan McGrath <redmcg(a)redmandi.dyndns.org>
ipv6: icmp6: Allow icmp messages to be looped back
Sascha Hauer <s.hauer(a)pengutronix.de>
mtd: nand: gpmi: Fix failure when a erased page has a bitflip at BBM
Anna-Maria Gleixner <anna-maria(a)linutronix.de>
hrtimer: Ensure POSIX compliance (relative CLOCK_REALTIME hrtimers)
Jakub Sitnicki <jkbs(a)redhat.com>
ipv6: Skip XFRM lookup if dst_entry in socket cache is valid
-------------
Diffstat:
Makefile | 4 +-
arch/sh/boards/mach-se/770x/setup.c | 10 ++++-
drivers/gpu/drm/ttm/ttm_page_alloc.c | 2 +
drivers/leds/led-core.c | 2 +-
drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 6 +--
drivers/net/ethernet/arc/emac_main.c | 53 ++++++++++++++----------
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 4 +-
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 14 ++++++-
drivers/net/ethernet/broadcom/tg3.c | 13 +++++-
drivers/net/ethernet/broadcom/tg3.h | 4 ++
drivers/net/ethernet/freescale/gianfar_ptp.c | 3 +-
drivers/net/ethernet/intel/e1000/e1000.h | 3 +-
drivers/net/ethernet/intel/e1000/e1000_main.c | 27 +++++++++---
drivers/net/phy/mdio-sun4i.c | 6 ++-
drivers/net/xen-netfront.c | 1 +
drivers/s390/block/dasd_3990_erp.c | 10 +++++
drivers/scsi/storvsc_drv.c | 3 +-
drivers/spi/spi-atmel.c | 2 +-
drivers/xen/gntdev.c | 8 ++--
fs/super.c | 6 ++-
fs/xfs/xfs_qm.c | 46 +++++++++++++-------
kernel/time/hrtimer.c | 7 +++-
net/ipv6/ip6_output.c | 11 ++---
net/ipv6/route.c | 1 +
net/sctp/socket.c | 16 ++++---
25 files changed, 180 insertions(+), 82 deletions(-)
This is the start of the stable review cycle for the 4.9.86 release.
There are 56 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun Mar 4 08:44:26 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.86-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.86-rc1
James Hogan <jhogan(a)kernel.org>
MIPS: Implement __multi3 for GCC7 MIPS64r6 builds
Punit Agrawal <punit.agrawal(a)arm.com>
KVM: arm/arm64: Fix check for hugepage size when allocating at Stage 2
Yangbo Lu <yangbo.lu(a)nxp.com>
net: gianfar_ptp: move set_fipers() to spinlock protecting area
Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
sctp: make use of pre-calculated len
Ross Lagerwall <ross.lagerwall(a)citrix.com>
xen/gntdev: Fix partial gntdev_mmap() cleanup
Ross Lagerwall <ross.lagerwall(a)citrix.com>
xen/gntdev: Fix off-by-one error when unmapping with holes
Sergei Shtylyov <sergei.shtylyov(a)cogentembedded.com>
SolutionEngine771x: fix Ether platform data
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
mdio-sun4i: Fix a memory leak
Eduardo Otubo <otubo(a)redhat.com>
xen-netfront: enable device after manual module load
Venkat Duvvuru <venkatkumar.duvvuru(a)broadcom.com>
bnxt_en: Fix the 'Invalid VF' id check in bnxt_vf_ndo_prep routine.
Luu An Phu <phu.luuan(a)nxp.com>
can: flex_can: Correct the checking for frame length in flexcan_start_xmit()
Johannes Berg <johannes.berg(a)intel.com>
mac80211: mesh: drop frames appearing to be from us
Hao Chen <flank3rsky(a)gmail.com>
nl80211: Check for the required netlink attribute presence
Alexander Duyck <alexander.h.duyck(a)intel.com>
i40e/i40evf: Account for frags split over multiple descriptors in check linearize
Felix Janda <felix.janda(a)posteo.de>
uapi libc compat: add fallback for unsupported libcs
Xiongwei Song <sxwjean(a)gmail.com>
drm/ttm: check the return value of kzalloc
SZ Lin (林上智) <sz.lin(a)moxa.com>
NET: usb: qmi_wwan: add support for YUGA CLM920-NC5 PID 0x9625
Tushar Dave <tushar.n.dave(a)oracle.com>
e1000: fix disabling already-disabled warning
Gao Feng <gfree.wind(a)vip.163.com>
macvlan: Fix one possible double free
Aliaksei Karaliou <akaraliou.dev(a)gmail.com>
xfs: quota: check result of register_shrinker()
Aliaksei Karaliou <akaraliou.dev(a)gmail.com>
xfs: quota: fix missed destroy of qi_tree_lock
Erez Shitrit <erezsh(a)mellanox.com>
IB/ipoib: Fix race condition in neigh creation
Leon Romanovsky <leonro(a)mellanox.com>
IB/mlx4: Fix mlx4_ib_alloc_mr error flow
Stefan Haberland <sth(a)linux.vnet.ibm.com>
s390/dasd: fix wrongly assigned configuration data
Guenter Roeck <linux(a)roeck-us.net>
genirq: Guard handle_bad_irq log messages
Nitzan Carmi <nitzanc(a)mellanox.com>
IB/mlx5: Fix mlx5_ib_alloc_mr error flow
Matthieu CASTET <matthieu.castet(a)parrot.com>
led: core: Fix brightness setting when setting delay_off=0
Guilherme G. Piccoli <gpiccoli(a)linux.vnet.ibm.com>
bnx2x: Improve reliability in case of nested PCI errors
Siva Reddy Kallam <siva.kallam(a)broadcom.com>
tg3: Enable PHY reset in MTU change path for 5720
Siva Reddy Kallam <siva.kallam(a)broadcom.com>
tg3: Add workaround to restrict 5762 MRRS to 2048
Tommi Rantala <tommi.t.rantala(a)nokia.com>
tipc: fix tipc_mon_delete() oops in tipc_enable_bearer() error path
Tommi Rantala <tommi.t.rantala(a)nokia.com>
tipc: error path leak fixes in tipc_enable_bearer()
James Hogan <jhogan(a)kernel.org>
lib/mpi: Fix umul_ppmm() for MIPS64r6
Arnd Bergmann <arnd(a)arndb.de>
ARM: dts: ls1021a: fix incorrect clock references
Cathy Avery <cavery(a)redhat.com>
scsi: storvsc: Fix scsi_cmd error assignments in storvsc_handle_error
Fredrik Hallenberg <megahallon(a)gmail.com>
net: stmmac: Fix TX timestamp calculation
Xin Long <lucien.xin(a)gmail.com>
ip6_tunnel: get the min mtu properly in ip6_tnl_xmit
Alexander Kochetkov <al.kochet(a)gmail.com>
net: arc_emac: fix arc_emac_rx() error paths
Sean Wang <sean.wang(a)mediatek.com>
net: mediatek: setup proper state for disabled GMAC on the default
Abhijeet Kumar <abhijeet.kumar(a)intel.com>
ASoC: nau8825: fix issue that pop noise when start capture
Radu Pirea <radu.pirea(a)microchip.com>
spi: atmel: fixed spin_lock usage inside atmel_spi_remove
Jia-Ju Bai <baijiaju1990(a)163.com>
mac80211_hwsim: Fix a possible sleep-in-atomic bug in hwsim_get_radio_nl
Karol Herbst <kherbst(a)redhat.com>
drm/nouveau/pci: do a msi rearm on init
Alexey Khoroshilov <khoroshilov(a)ispras.ru>
net: phy: xgene: disable clk on error paths
Al Viro <viro(a)zeniv.linux.org.uk>
sget(): handle failures of register_shrinker()
Arnaldo Carvalho de Melo <acme(a)redhat.com>
x86/asm: Allow again using asm.h when building for the 'bpf' clang target
Chunyan Zhang <zhang.lyra(a)gmail.com>
ARM: 8731/1: Fix csum_partial_copy_from_user() stack mismatch
Brendan McGrath <redmcg(a)redmandi.dyndns.org>
ipv6: icmp6: Allow icmp messages to be looped back
Albert Hsieh <wen.hsieh(a)broadcom.com>
mtd: nand: brcmnand: Zero bitflip is not an error
Sascha Hauer <s.hauer(a)pengutronix.de>
mtd: nand: gpmi: Fix failure when a erased page has a bitflip at BBM
Daniele Palmas <dnlplm(a)gmail.com>
net: usb: qmi_wwan: add Telit ME910 PID 0x1101 support
Keith Busch <keith.busch(a)intel.com>
nvme: check hw sectors before setting chunk sectors
Andreas Platschek <andreas.platschek(a)opentech.at>
dmaengine: fsl-edma: disable clks on all error paths
Yunlei He <heyunlei(a)huawei.com>
f2fs: fix a bug caused by NULL extent tree
Ben Gardner <gardner.ben(a)gmail.com>
i2c: designware: must wait for enable
Anna-Maria Gleixner <anna-maria(a)linutronix.de>
hrtimer: Ensure POSIX compliance (relative CLOCK_REALTIME hrtimers)
-------------
Diffstat:
Makefile | 4 +-
arch/arm/boot/dts/ls1021a-qds.dts | 2 +-
arch/arm/boot/dts/ls1021a-twr.dts | 2 +-
arch/arm/kvm/mmu.c | 2 +-
arch/arm/lib/csumpartialcopyuser.S | 4 ++
arch/mips/lib/Makefile | 3 +-
arch/mips/lib/libgcc.h | 17 +++++++
arch/mips/lib/multi3.c | 54 +++++++++++++++++++++
arch/sh/boards/mach-se/770x/setup.c | 10 +++-
arch/x86/include/asm/asm.h | 2 +
drivers/dma/fsl-edma.c | 28 +++++------
drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c | 7 +++
drivers/gpu/drm/ttm/ttm_page_alloc.c | 2 +
drivers/i2c/busses/i2c-designware-core.c | 2 +-
drivers/infiniband/hw/mlx4/mr.c | 2 +-
drivers/infiniband/hw/mlx5/mr.c | 1 +
drivers/infiniband/ulp/ipoib/ipoib_main.c | 25 +++++++---
drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 5 +-
drivers/leds/led-core.c | 2 +-
drivers/mtd/nand/brcmnand/brcmnand.c | 2 +-
drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 6 +--
drivers/net/can/flexcan.c | 2 +-
drivers/net/ethernet/arc/emac_main.c | 53 ++++++++++++---------
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 4 +-
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 14 +++++-
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 2 +-
drivers/net/ethernet/broadcom/tg3.c | 13 ++++-
drivers/net/ethernet/broadcom/tg3.h | 4 ++
drivers/net/ethernet/freescale/gianfar_ptp.c | 3 +-
drivers/net/ethernet/intel/e1000/e1000.h | 3 +-
drivers/net/ethernet/intel/e1000/e1000_main.c | 27 +++++++++--
drivers/net/ethernet/intel/i40e/i40e_txrx.c | 26 ++++++++--
drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 26 ++++++++--
drivers/net/ethernet/mediatek/mtk_eth_soc.c | 11 +++--
.../net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c | 6 ++-
drivers/net/macvlan.c | 7 ++-
drivers/net/phy/mdio-sun4i.c | 6 ++-
drivers/net/phy/mdio-xgene.c | 21 ++++++---
drivers/net/usb/qmi_wwan.c | 2 +
drivers/net/wireless/mac80211_hwsim.c | 2 +-
drivers/net/xen-netfront.c | 1 +
drivers/nvme/host/core.c | 3 +-
drivers/s390/block/dasd_3990_erp.c | 10 ++++
drivers/scsi/storvsc_drv.c | 3 +-
drivers/spi/spi-atmel.c | 2 +-
drivers/xen/gntdev.c | 8 ++--
fs/f2fs/extent_cache.c | 12 ++++-
fs/super.c | 6 ++-
fs/xfs/xfs_qm.c | 46 +++++++++++-------
include/uapi/linux/libc-compat.h | 55 +++++++++++++++++++++-
kernel/irq/debug.h | 5 ++
kernel/time/hrtimer.c | 7 ++-
lib/mpi/longlong.h | 18 ++++++-
net/ipv6/ip6_tunnel.c | 9 +++-
net/ipv6/route.c | 1 +
net/mac80211/rx.c | 2 +
net/sctp/socket.c | 16 ++++---
net/tipc/bearer.c | 5 +-
net/tipc/monitor.c | 6 ++-
net/wireless/nl80211.c | 3 +-
sound/soc/codecs/nau8825.c | 1 +
61 files changed, 498 insertions(+), 135 deletions(-)
The patch titled
Subject: mm/gup.c: teach get_user_pages_unlocked to handle FOLL_NOWAIT
has been added to the -mm tree. Its filename is
mm-gup-teach-get_user_pages_unlocked-to-handle-foll_nowait.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-gup-teach-get_user_pages_unlock…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-gup-teach-get_user_pages_unlock…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/SubmitChecklist when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Andrea Arcangeli <aarcange(a)redhat.com>
Subject: mm/gup.c: teach get_user_pages_unlocked to handle FOLL_NOWAIT
KVM is hanging during postcopy live migration with userfaultfd because
get_user_pages_unlocked is not capable to handle FOLL_NOWAIT.
Earlier FOLL_NOWAIT was only ever passed to get_user_pages.
Specifically faultin_page (the callee of get_user_pages_unlocked caller)
doesn't know that if FAULT_FLAG_RETRY_NOWAIT was set in the page fault
flags, when VM_FAULT_RETRY is returned, the mmap_sem wasn't actually
released (even if nonblocking is not NULL). So it sets *nonblocking to
zero and the caller won't release the mmap_sem thinking it was already
released, but it wasn't because of FOLL_NOWAIT.
Link: http://lkml.kernel.org/r/20180302174343.5421-2-aarcange@redhat.com
Fixes: ce53053ce378c ("kvm: switch get_user_page_nowait() to get_user_pages_unlocked()")
Signed-off-by: Andrea Arcangeli <aarcange(a)redhat.com>
Reported-by: Dr. David Alan Gilbert <dgilbert(a)redhat.com>
Tested-by: Dr. David Alan Gilbert <dgilbert(a)redhat.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/gup.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff -puN mm/gup.c~mm-gup-teach-get_user_pages_unlocked-to-handle-foll_nowait mm/gup.c
--- a/mm/gup.c~mm-gup-teach-get_user_pages_unlocked-to-handle-foll_nowait
+++ a/mm/gup.c
@@ -516,7 +516,7 @@ static int faultin_page(struct task_stru
}
if (ret & VM_FAULT_RETRY) {
- if (nonblocking)
+ if (nonblocking && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
*nonblocking = 0;
return -EBUSY;
}
@@ -890,7 +890,10 @@ static __always_inline long __get_user_p
break;
}
if (*locked) {
- /* VM_FAULT_RETRY didn't trigger */
+ /*
+ * VM_FAULT_RETRY didn't trigger or it was a
+ * FOLL_NOWAIT.
+ */
if (!pages_done)
pages_done = ret;
break;
_
Patches currently in -mm which might be from aarcange(a)redhat.com are
mm-gup-teach-get_user_pages_unlocked-to-handle-foll_nowait.patch