From: Nick Desaulniers <ndesaulniers(a)google.com>
Subject: lib/string.c: implement stpcpy
LLVM implemented a recent "libcall optimization" that lowers calls to
`sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest,
str) - dest`. This generally avoids the machinery involved in parsing
format strings. `stpcpy` is just like `strcpy` except it returns the
pointer to the new tail of `dest`. This optimization was introduced into
clang-12.
Implement this so that we don't observe linkage failures due to missing
symbol definitions for `stpcpy`.
Similar to last year's fire drill with:
commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")
The kernel is somewhere between a "freestanding" environment (no full
libc) and "hosted" environment (many symbols from libc exist with the same
type, function signature, and semantics).
As H. Peter Anvin notes, there's not really a great way to inform the
compiler that you're targeting a freestanding environment but would like
to opt-in to some libcall optimizations (see pr/47280 below), rather than
opt-out.
Arvind notes, -fno-builtin-* behaves slightly differently between GCC
and Clang, and Clang is missing many __builtin_* definitions, which I
consider a bug in Clang and am working on fixing.
Masahiro summarizes the subtle distinction between compilers justly:
To prevent transformation from foo() into bar(), there are two ways in
Clang to do that; -fno-builtin-foo, and -fno-builtin-bar. There is
only one in GCC; -fno-buitin-foo.
(Any difference in that behavior in Clang is likely a bug from a missing
__builtin_* definition.)
Masahiro also notes:
We want to disable optimization from foo() to bar(),
but we may still benefit from the optimization from
foo() into something else. If GCC implements the same transform, we
would run into a problem because it is not -fno-builtin-bar, but
-fno-builtin-foo that disables that optimization.
In this regard, -fno-builtin-foo would be more future-proof than
-fno-built-bar, but -fno-builtin-foo is still potentially overkill. We
may want to prevent calls from foo() being optimized into calls to
bar(), but we still may want other optimization on calls to foo().
It seems that compilers today don't quite provide the fine grain control
over which libcall optimizations pseudo-freestanding environments would
prefer.
Finally, Kees notes that this interface is unsafe, so we should not
encourage its use. As such, I've removed the declaration from any
header, but it still needs to be exported to avoid linkage errors in
modules.
Link: https://lkml.kernel.org/r/20200914161643.938408-1-ndesaulniers@google.com
Link: https://bugs.llvm.org/show_bug.cgi?id=47162
Link: https://bugs.llvm.org/show_bug.cgi?id=47280
Link: https://github.com/ClangBuiltLinux/linux/issues/1126
Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html
Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html
Link: https://reviews.llvm.org/D85963
Reported-by: Sami Tolvanen <samitolvanen(a)google.com>
Suggested-by: Andy Lavr <andy.lavr(a)gmail.com>
Suggested-by: Arvind Sankar <nivedita(a)alum.mit.edu>
Suggested-by: Joe Perches <joe(a)perches.com>
Suggested-by: Kees Cook <keescook(a)chromium.org>
Suggested-by: Masahiro Yamada <masahiroy(a)kernel.org>
Suggested-by: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Signed-off-by: Nick Desaulniers <ndesaulniers(a)google.com>
Tested-by: Nathan Chancellor <natechancellor(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/string.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
--- a/lib/string.c~lib-stringc-implement-stpcpy
+++ a/lib/string.c
@@ -272,6 +272,30 @@ ssize_t strscpy_pad(char *dest, const ch
}
EXPORT_SYMBOL(strscpy_pad);
+/**
+ * stpcpy - copy a string from src to dest returning a pointer to the new end
+ * of dest, including src's %NUL-terminator. May overrun dest.
+ * @dest: pointer to end of string being copied into. Must be large enough
+ * to receive copy.
+ * @src: pointer to the beginning of string being copied from. Must not overlap
+ * dest.
+ *
+ * stpcpy differs from strcpy in a key way: the return value is a pointer
+ * to the new %NUL-terminating character in @dest. (For strcpy, the return
+ * value is a pointer to the start of @dest). This interface is considered
+ * unsafe as it doesn't perform bounds checking of the inputs. As such it's
+ * not recommended for usage. Instead, its definition is provided in case
+ * the compiler lowers other libcalls to stpcpy.
+ */
+char *stpcpy(char *__restrict__ dest, const char *__restrict__ src);
+char *stpcpy(char *__restrict__ dest, const char *__restrict__ src)
+{
+ while ((*dest++ = *src++) != '\0')
+ /* nothing */;
+ return --dest;
+}
+EXPORT_SYMBOL(stpcpy);
+
#ifndef __HAVE_ARCH_STRCAT
/**
* strcat - Append one %NUL-terminated string to another
_
From: Vasily Gorbik <gor(a)linux.ibm.com>
Subject: mm/gup: fix gup_fast with dynamic page table folding
Currently to make sure that every page table entry is read just once
gup_fast walks perform READ_ONCE and pass pXd value down to the next
gup_pXd_range function by value e.g.:
static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
...
pudp = pud_offset(&p4d, addr);
This function passes a reference on that local value copy to pXd_offset,
and might get the very same pointer in return. This happens when the
level is folded (on most arches), and that pointer should not be iterated.
On s390 due to the fact that each task might have different 5,4 or 3-level
address translation and hence different levels folded the logic is more
complex and non-iteratable pointer to a local copy leads to severe
problems.
Here is an example of what happens with gup_fast on s390, for a task with
3-levels paging, crossing a 2 GB pud boundary:
// addr = 0x1007ffff000, end = 0x10080001000
static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
{
unsigned long next;
pud_t *pudp;
// pud_offset returns &p4d itself (a pointer to a value on stack)
pudp = pud_offset(&p4d, addr);
do {
// on second iteratation reading "random" stack value
pud_t pud = READ_ONCE(*pudp);
// next = 0x10080000000, due to PUD_SIZE/MASK != PGDIR_SIZE/MASK on s390
next = pud_addr_end(addr, end);
...
} while (pudp++, addr = next, addr != end); // pudp++ iterating over stack
return 1;
}
This happens since s390 moved to common gup code with commit d1874a0c2805
("s390/mm: make the pxd_offset functions more robust") and commit
1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code").
s390 tried to mimic static level folding by changing pXd_offset
primitives to always calculate top level page table offset in pgd_offset
and just return the value passed when pXd_offset has to act as folded.
What is crucial for gup_fast and what has been overlooked is that
PxD_SIZE/MASK and thus pXd_addr_end should also change correspondingly.
And the latter is not possible with dynamic folding.
To fix the issue in addition to pXd values pass original pXdp pointers
down to gup_pXd_range functions. And introduce pXd_offset_lockless
helpers, which take an additional pXd entry value parameter. This has
already been discussed in
https://lkml.kernel.org/r/20190418100218.0a4afd51@mschwideX1
Link: https://lkml.kernel.org/r/patch.git-943f1e5dcff2.your-ad-here.call-01599856…
Fixes: 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code")
Signed-off-by: Vasily Gorbik <gor(a)linux.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer(a)linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev(a)linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Reviewed-by: Mike Rapoport <rppt(a)linux.ibm.com>
Reviewed-by: John Hubbard <jhubbard(a)nvidia.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Russell King <linux(a)armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh(a)kernel.crashing.org>
Cc: Paul Mackerras <paulus(a)samba.org>
Cc: Jeff Dike <jdike(a)addtoit.com>
Cc: Richard Weinberger <richard(a)nod.at>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Christian Borntraeger <borntraeger(a)de.ibm.com>
Cc: Claudio Imbrenda <imbrenda(a)linux.ibm.com>
Cc: <stable(a)vger.kernel.org> [5.2+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/s390/include/asm/pgtable.h | 42 +++++++++++++++++++++---------
include/linux/pgtable.h | 10 +++++++
mm/gup.c | 18 ++++++------
3 files changed, 49 insertions(+), 21 deletions(-)
--- a/arch/s390/include/asm/pgtable.h~mm-gup-fix-gup_fast-with-dynamic-page-table-folding
+++ a/arch/s390/include/asm/pgtable.h
@@ -1260,26 +1260,44 @@ static inline pgd_t *pgd_offset_raw(pgd_
#define pgd_offset(mm, address) pgd_offset_raw(READ_ONCE((mm)->pgd), address)
-static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
+static inline p4d_t *p4d_offset_lockless(pgd_t *pgdp, pgd_t pgd, unsigned long address)
{
- if ((pgd_val(*pgd) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R1)
- return (p4d_t *) pgd_deref(*pgd) + p4d_index(address);
- return (p4d_t *) pgd;
+ if ((pgd_val(pgd) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R1)
+ return (p4d_t *) pgd_deref(pgd) + p4d_index(address);
+ return (p4d_t *) pgdp;
}
+#define p4d_offset_lockless p4d_offset_lockless
-static inline pud_t *pud_offset(p4d_t *p4d, unsigned long address)
+static inline p4d_t *p4d_offset(pgd_t *pgdp, unsigned long address)
{
- if ((p4d_val(*p4d) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R2)
- return (pud_t *) p4d_deref(*p4d) + pud_index(address);
- return (pud_t *) p4d;
+ return p4d_offset_lockless(pgdp, *pgdp, address);
+}
+
+static inline pud_t *pud_offset_lockless(p4d_t *p4dp, p4d_t p4d, unsigned long address)
+{
+ if ((p4d_val(p4d) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R2)
+ return (pud_t *) p4d_deref(p4d) + pud_index(address);
+ return (pud_t *) p4dp;
+}
+#define pud_offset_lockless pud_offset_lockless
+
+static inline pud_t *pud_offset(p4d_t *p4dp, unsigned long address)
+{
+ return pud_offset_lockless(p4dp, *p4dp, address);
}
#define pud_offset pud_offset
-static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
+static inline pmd_t *pmd_offset_lockless(pud_t *pudp, pud_t pud, unsigned long address)
+{
+ if ((pud_val(pud) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R3)
+ return (pmd_t *) pud_deref(pud) + pmd_index(address);
+ return (pmd_t *) pudp;
+}
+#define pmd_offset_lockless pmd_offset_lockless
+
+static inline pmd_t *pmd_offset(pud_t *pudp, unsigned long address)
{
- if ((pud_val(*pud) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R3)
- return (pmd_t *) pud_deref(*pud) + pmd_index(address);
- return (pmd_t *) pud;
+ return pmd_offset_lockless(pudp, *pudp, address);
}
#define pmd_offset pmd_offset
--- a/include/linux/pgtable.h~mm-gup-fix-gup_fast-with-dynamic-page-table-folding
+++ a/include/linux/pgtable.h
@@ -1427,6 +1427,16 @@ typedef unsigned int pgtbl_mod_mask;
#define mm_pmd_folded(mm) __is_defined(__PAGETABLE_PMD_FOLDED)
#endif
+#ifndef p4d_offset_lockless
+#define p4d_offset_lockless(pgdp, pgd, address) p4d_offset(&(pgd), address)
+#endif
+#ifndef pud_offset_lockless
+#define pud_offset_lockless(p4dp, p4d, address) pud_offset(&(p4d), address)
+#endif
+#ifndef pmd_offset_lockless
+#define pmd_offset_lockless(pudp, pud, address) pmd_offset(&(pud), address)
+#endif
+
/*
* p?d_leaf() - true if this entry is a final mapping to a physical address.
* This differs from p?d_huge() by the fact that they are always available (if
--- a/mm/gup.c~mm-gup-fix-gup_fast-with-dynamic-page-table-folding
+++ a/mm/gup.c
@@ -2485,13 +2485,13 @@ static int gup_huge_pgd(pgd_t orig, pgd_
return 1;
}
-static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
+static int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
{
unsigned long next;
pmd_t *pmdp;
- pmdp = pmd_offset(&pud, addr);
+ pmdp = pmd_offset_lockless(pudp, pud, addr);
do {
pmd_t pmd = READ_ONCE(*pmdp);
@@ -2528,13 +2528,13 @@ static int gup_pmd_range(pud_t pud, unsi
return 1;
}
-static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
+static int gup_pud_range(p4d_t *p4dp, p4d_t p4d, unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
{
unsigned long next;
pud_t *pudp;
- pudp = pud_offset(&p4d, addr);
+ pudp = pud_offset_lockless(p4dp, p4d, addr);
do {
pud_t pud = READ_ONCE(*pudp);
@@ -2549,20 +2549,20 @@ static int gup_pud_range(p4d_t p4d, unsi
if (!gup_huge_pd(__hugepd(pud_val(pud)), addr,
PUD_SHIFT, next, flags, pages, nr))
return 0;
- } else if (!gup_pmd_range(pud, addr, next, flags, pages, nr))
+ } else if (!gup_pmd_range(pudp, pud, addr, next, flags, pages, nr))
return 0;
} while (pudp++, addr = next, addr != end);
return 1;
}
-static int gup_p4d_range(pgd_t pgd, unsigned long addr, unsigned long end,
+static int gup_p4d_range(pgd_t *pgdp, pgd_t pgd, unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
{
unsigned long next;
p4d_t *p4dp;
- p4dp = p4d_offset(&pgd, addr);
+ p4dp = p4d_offset_lockless(pgdp, pgd, addr);
do {
p4d_t p4d = READ_ONCE(*p4dp);
@@ -2574,7 +2574,7 @@ static int gup_p4d_range(pgd_t pgd, unsi
if (!gup_huge_pd(__hugepd(p4d_val(p4d)), addr,
P4D_SHIFT, next, flags, pages, nr))
return 0;
- } else if (!gup_pud_range(p4d, addr, next, flags, pages, nr))
+ } else if (!gup_pud_range(p4dp, p4d, addr, next, flags, pages, nr))
return 0;
} while (p4dp++, addr = next, addr != end);
@@ -2602,7 +2602,7 @@ static void gup_pgd_range(unsigned long
if (!gup_huge_pd(__hugepd(pgd_val(pgd)), addr,
PGDIR_SHIFT, next, flags, pages, nr))
return;
- } else if (!gup_p4d_range(pgd, addr, next, flags, pages, nr))
+ } else if (!gup_p4d_range(pgdp, pgd, addr, next, flags, pages, nr))
return;
} while (pgdp++, addr = next, addr != end);
}
_
From: Gao Xiang <hsiangkao(a)redhat.com>
Subject: mm, THP, swap: fix allocating cluster for swapfile by mistake
SWP_FS is used to make swap_{read,write}page() go through the filesystem,
and it's only used for swap files over NFS. So, !SWP_FS means non NFS for
now, it could be either file backed or device backed. Something similar
goes with legacy SWP_FILE.
So in order to achieve the goal of the original patch, SWP_BLKDEV should
be used instead.
FS corruption can be observed with SSD device + XFS + fragmented swapfile
due to CONFIG_THP_SWAP=y.
I reproduced the issue with the following details:
Environment:
QEMU + upstream kernel + buildroot + NVMe (2 GB)
Kernel config:
CONFIG_BLK_DEV_NVME=y
CONFIG_THP_SWAP=y
Some reproducable steps:
mkfs.xfs -f /dev/nvme0n1
mkdir /tmp/mnt
mount /dev/nvme0n1 /tmp/mnt
bs="32k"
sz="1024m" # doesn't matter too much, I also tried 16m
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -F -S 0 -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fsync" /tmp/mnt/sw
mkswap /tmp/mnt/sw
swapon /tmp/mnt/sw
stress --vm 2 --vm-bytes 600M # doesn't matter too much as well
Symptoms:
- FS corruption (e.g. checksum failure)
- memory corruption at: 0xd2808010
- segfault
Link: https://lkml.kernel.org/r/20200820045323.7809-1-hsiangkao@redhat.com
Fixes: f0eea189e8e9 ("mm, THP, swap: Don't allocate huge cluster for file backed swap device")
Fixes: 38d8b4e6bdc8 ("mm, THP, swap: delay splitting THP during swap out")
Signed-off-by: Gao Xiang <hsiangkao(a)redhat.com>
Reviewed-by: "Huang, Ying" <ying.huang(a)intel.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Acked-by: Rafael Aquini <aquini(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Carlos Maiolino <cmaiolino(a)redhat.com>
Cc: Eric Sandeen <esandeen(a)redhat.com>
Cc: Dave Chinner <david(a)fromorbit.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/swapfile.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/swapfile.c~mm-thp-swap-fix-allocating-cluster-for-swapfile-by-mistake
+++ a/mm/swapfile.c
@@ -1078,7 +1078,7 @@ start_over:
goto nextsi;
}
if (size == SWAPFILE_CLUSTER) {
- if (!(si->flags & SWP_FS))
+ if (si->flags & SWP_BLKDEV)
n_ret = swap_alloc_cluster(si, swp_entries);
} else
n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
_
The patch titled
Subject: mm/memory_hotplug: drain per-cpu pages again during memory offline
has been removed from the -mm tree. Its filename was
mm-memory_hotplug-drain-per-cpu-pages-again-during-memory-offline.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Subject: mm/memory_hotplug: drain per-cpu pages again during memory offline
There is a race during page offline that can lead to infinite loop:
a page never ends up on a buddy list and __offline_pages() keeps
retrying infinitely or until a termination signal is received.
Thread#1 - a new process:
load_elf_binary
begin_new_exec
exec_mmap
mmput
exit_mmap
tlb_finish_mmu
tlb_flush_mmu
release_pages
free_unref_page_list
free_unref_page_prepare
set_pcppage_migratetype(page, migratetype);
// Set page->index migration type below MIGRATE_PCPTYPES
Thread#2 - hot-removes memory
__offline_pages
start_isolate_page_range
set_migratetype_isolate
set_pageblock_migratetype(page, MIGRATE_ISOLATE);
Set migration type to MIGRATE_ISOLATE-> set
drain_all_pages(zone);
// drain per-cpu page lists to buddy allocator.
Thread#1 - continue
free_unref_page_commit
migratetype = get_pcppage_migratetype(page);
// get old migration type
list_add(&page->lru, &pcp->lists[migratetype]);
// add new page to already drained pcp list
Thread#2
Never drains pcp again, and therefore gets stuck in the loop.
The fix is to try to drain per-cpu lists again after
check_pages_isolated_cb() fails.
Link: https://lkml.kernel.org/r/20200903140032.380431-1-pasha.tatashin@soleen.com
Link: https://lkml.kernel.org/r/20200904151448.100489-2-pasha.tatashin@soleen.com
Link: http://lkml.kernel.org/r/20200904070235.GA15277@dhcp22.suse.cz
Fixes: c52e75935f8d ("mm: remove extra drain pages on pcp list")
Signed-off-by: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Acked-by: David Rientjes <rientjes(a)google.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Wei Yang <richard.weiyang(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory_hotplug.c | 14 ++++++++++++++
mm/page_isolation.c | 8 ++++++++
2 files changed, 22 insertions(+)
--- a/mm/memory_hotplug.c~mm-memory_hotplug-drain-per-cpu-pages-again-during-memory-offline
+++ a/mm/memory_hotplug.c
@@ -1575,6 +1575,20 @@ static int __ref __offline_pages(unsigne
/* check again */
ret = walk_system_ram_range(start_pfn, end_pfn - start_pfn,
NULL, check_pages_isolated_cb);
+ /*
+ * per-cpu pages are drained in start_isolate_page_range, but if
+ * there are still pages that are not free, make sure that we
+ * drain again, because when we isolated range we might
+ * have raced with another thread that was adding pages to pcp
+ * list.
+ *
+ * Forward progress should be still guaranteed because
+ * pages on the pcp list can only belong to MOVABLE_ZONE
+ * because has_unmovable_pages explicitly checks for
+ * PageBuddy on freed pages on other zones.
+ */
+ if (ret)
+ drain_all_pages(zone);
} while (ret);
/* Ok, all of our target is isolated.
--- a/mm/page_isolation.c~mm-memory_hotplug-drain-per-cpu-pages-again-during-memory-offline
+++ a/mm/page_isolation.c
@@ -170,6 +170,14 @@ __first_valid_page(unsigned long pfn, un
* pageblocks we may have modified and return -EBUSY to caller. This
* prevents two threads from simultaneously working on overlapping ranges.
*
+ * Please note that there is no strong synchronization with the page allocator
+ * either. Pages might be freed while their page blocks are marked ISOLATED.
+ * In some cases pages might still end up on pcp lists and that would allow
+ * for their allocation even when they are in fact isolated already. Depending
+ * on how strong of a guarantee the caller needs drain_all_pages might be needed
+ * (e.g. __offline_pages will need to call it after check for isolated range for
+ * a next retry).
+ *
* Return: the number of isolated pageblocks on success and -EBUSY if any part
* of range cannot be isolated.
*/
_
Patches currently in -mm which might be from pasha.tatashin(a)soleen.com are
The patch titled
Subject: selftests/vm: fix display of page size in map_hugetlb
has been removed from the -mm tree. Its filename was
selftests-vm-fix-display-of-page-size-in-map_hugetlb.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Subject: selftests/vm: fix display of page size in map_hugetlb
The displayed size is in bytes while the text says it is in kB.
Shift it by 10 to really display kBytes.
Link: https://lkml.kernel.org/r/e27481224564a93d14106e750de31189deaa8bc8.15988619…
Fixes: fa7b9a805c79 ("tools/selftest/vm: allow choosing mem size and page size in map_hugetlb")
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/vm/map_hugetlb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/vm/map_hugetlb.c~selftests-vm-fix-display-of-page-size-in-map_hugetlb
+++ a/tools/testing/selftests/vm/map_hugetlb.c
@@ -83,7 +83,7 @@ int main(int argc, char **argv)
}
if (shift)
- printf("%u kB hugepages\n", 1 << shift);
+ printf("%u kB hugepages\n", 1 << (shift - 10));
else
printf("Default size hugepages\n");
printf("Mapping %lu Mbytes\n", (unsigned long)length >> 20);
_
Patches currently in -mm which might be from christophe.leroy(a)csgroup.eu are