This is a note to let you know that I've just added the patch titled
iio: adc: cpcap: fix incorrect validation
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 81b039ec36a41a5451e1e36f05bb055eceab1dc8 Mon Sep 17 00:00:00 2001
From: Pan Bian <bianpan2016(a)163.com>
Date: Mon, 13 Nov 2017 00:01:20 +0800
Subject: iio: adc: cpcap: fix incorrect validation
Function platform_get_irq_byname() returns a negative error code on
failure, and a zero or positive number on success. However, in function
cpcap_adc_probe(), positive IRQ numbers are also taken as error cases.
Use "if (ddata->irq < 0)" instead of "if (!ddata->irq)" to validate the
return value of platform_get_irq_byname().
Signed-off-by: Pan Bian <bianpan2016(a)163.com>
Fixes: 25ec249632d50 ("iio: adc: cpcap: Add minimal support for CPCAP PMIC ADC")
Reviewed-by: Sebastian Reichel <sebastian.reichel(a)collabora.co.uk>
Acked-by: Tony Lindgren <tony(a)atomide.com>
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/cpcap-adc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iio/adc/cpcap-adc.c b/drivers/iio/adc/cpcap-adc.c
index 3576ec73ec23..9ad60421d360 100644
--- a/drivers/iio/adc/cpcap-adc.c
+++ b/drivers/iio/adc/cpcap-adc.c
@@ -1011,7 +1011,7 @@ static int cpcap_adc_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, indio_dev);
ddata->irq = platform_get_irq_byname(pdev, "adcdone");
- if (!ddata->irq)
+ if (ddata->irq < 0)
return -ENODEV;
error = devm_request_threaded_irq(&pdev->dev, ddata->irq, NULL,
--
2.15.1
backup_info field is only allocated for decrypt code path.
The field was not nullified when not used causing a kfree
in an error handling path to attempt to free random
addresses as uncovered in stress testing.
Fixes: 737aed947f9b ("staging: ccree: save ciphertext for CTS IV")
Cc: stable(a)vger.kernel.org
Signed-off-by: Gilad Ben-Yossef <gilad(a)benyossef.com>
---
drivers/staging/ccree/ssi_cipher.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/staging/ccree/ssi_cipher.c b/drivers/staging/ccree/ssi_cipher.c
index 9019615..7b484f1 100644
--- a/drivers/staging/ccree/ssi_cipher.c
+++ b/drivers/staging/ccree/ssi_cipher.c
@@ -907,6 +907,7 @@ static int ssi_ablkcipher_encrypt(struct ablkcipher_request *req)
unsigned int ivsize = crypto_ablkcipher_ivsize(ablk_tfm);
req_ctx->is_giv = false;
+ req_ctx->backup_info = NULL;
return ssi_blkcipher_process(tfm, req_ctx, req->dst, req->src,
req->nbytes, req->info, ivsize,
--
2.7.4
On Tue, Sep 5, 2017 at 1:06 PM, Marcin Nowakowski
<marcin.nowakowski(a)imgtec.com> wrote:
> Change 73fbc1eba7ff added a fix to ensure that the memory range between
> PHYS_OFFSET and low memory address specified by mem= cmdline argument is
> not later processed by free_all_bootmem.
> This change was incorrect for systems where the commandline specifies
> more than 1 mem argument, as it will cause all memory between
> PHYS_OFFSET and each of the memory offsets to be marked as reserved,
> which results in parts of the RAM marked as reserved (Creator CI20's
> u-boot has a default commandline argument 'mem=256M@0x0
> mem=768M@0x30000000').
>
> Change the behaviour to ensure that only the range between PHYS_OFFSET
> and the lowest start address of the memories is marked as protected.
>
> This change also ensures that the range is marked protected even if it's
> only defined through the devicetree and not only via commandline
> arguments.
>
> Reported-by: Mathieu Malaterre <mathieu.malaterre(a)gmail.com>
> Signed-off-by: Marcin Nowakowski <marcin.nowakowski(a)imgtec.com>
> Fixes: 73fbc1eba7ff ("MIPS: fix mem=X@Y commandline processing")
> Cc: stable(a)vger.kernel.org
> ---
> arch/mips/kernel/setup.c | 19 ++++++++++++++++---
> 1 file changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
> index fe39397..a1c39ec 100644
> --- a/arch/mips/kernel/setup.c
> +++ b/arch/mips/kernel/setup.c
> @@ -374,6 +374,7 @@ static void __init bootmem_init(void)
> unsigned long reserved_end;
> unsigned long mapstart = ~0UL;
> unsigned long bootmap_size;
> + phys_addr_t ramstart = ~0UL;
> bool bootmap_valid = false;
> int i;
>
> @@ -394,6 +395,21 @@ static void __init bootmem_init(void)
> max_low_pfn = 0;
>
> /*
> + * Reserve any memory between the start of RAM and PHYS_OFFSET
> + */
> + for (i = 0; i < boot_mem_map.nr_map; i++) {
> + if (boot_mem_map.map[i].type != BOOT_MEM_RAM)
> + continue;
> +
> + ramstart = min(ramstart, boot_mem_map.map[i].addr);
> + }
> +
> + if (ramstart > PHYS_OFFSET)
> + add_memory_region(PHYS_OFFSET, ramstart - PHYS_OFFSET,
> + BOOT_MEM_RESERVED);
> +
> +
> + /*
> * Find the highest page frame number we have available.
> */
> for (i = 0; i < boot_mem_map.nr_map; i++) {
> @@ -663,9 +679,6 @@ static int __init early_parse_mem(char *p)
>
> add_memory_region(start, size, BOOT_MEM_RAM);
>
> - if (start && start > PHYS_OFFSET)
> - add_memory_region(PHYS_OFFSET, start - PHYS_OFFSET,
> - BOOT_MEM_RESERVED);
> return 0;
> }
> early_param("mem", early_parse_mem);
> --
> 2.7.4
>
>
Teested-by: Mathieu Malaterre <malat(a)debian.org>
Would be nice to have it upstream at some point...
Thanks
From: Xin Long <lucien.xin(a)gmail.com>
[ Upstream commit cebe84c6190d741045a322f5343f717139993c08 ]
Now when ip route flush cache and it turn out all fnhe_genid != genid.
If a redirect/pmtu icmp packet comes and the old fnhe is found and all
it's members but fnhe_genid will be updated.
Then next time when it looks up route and tries to rebind this fnhe to
the new dst, the fnhe will be flushed due to fnhe_genid != genid. It
causes this redirect/pmtu icmp packet acutally not to be applied.
This patch is to also reset fnhe_genid when updating a route cache.
Fixes: 5aad1de5ea2c ("ipv4: use separate genid for next hop exceptions")
Acked-by: Hannes Frederic Sowa <hannes(a)stressinduktion.org>
Signed-off-by: Xin Long <lucien.xin(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
---
net/ipv4/route.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 2fe9459fbe24..a04d29d70e6a 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -622,9 +622,12 @@ static void update_or_create_fnhe(struct fib_nh *nh, __be32 daddr, __be32 gw,
struct fnhe_hash_bucket *hash;
struct fib_nh_exception *fnhe;
struct rtable *rt;
+ u32 genid, hval;
unsigned int i;
int depth;
- u32 hval = fnhe_hashfun(daddr);
+
+ genid = fnhe_genid(dev_net(nh->nh_dev));
+ hval = fnhe_hashfun(daddr);
spin_lock_bh(&fnhe_lock);
@@ -647,6 +650,8 @@ static void update_or_create_fnhe(struct fib_nh *nh, __be32 daddr, __be32 gw,
}
if (fnhe) {
+ if (fnhe->fnhe_genid != genid)
+ fnhe->fnhe_genid = genid;
if (gw)
fnhe->fnhe_gw = gw;
if (pmtu) {
@@ -671,7 +676,7 @@ static void update_or_create_fnhe(struct fib_nh *nh, __be32 daddr, __be32 gw,
fnhe->fnhe_next = hash->chain;
rcu_assign_pointer(hash->chain, fnhe);
}
- fnhe->fnhe_genid = fnhe_genid(dev_net(nh->nh_dev));
+ fnhe->fnhe_genid = genid;
fnhe->fnhe_daddr = daddr;
fnhe->fnhe_gw = gw;
fnhe->fnhe_pmtu = pmtu;
--
2.11.0
From: Masahiro Yamada <yamada.masahiro(a)socionext.com>
[ Upstream commit 2dbc644ac62bbcb9ee78e84719953f611be0413d ]
For rpm-pkg and deb-pkg, a source tar file is created. All paths in
the archive must be prefixed with the base name of the tar so that
everything is contained in the directory when you extract it.
Currently, scripts/package/Makefile uses a symlink for that, and
removes it after the tar is created.
If you terminate the build during the tar creation, the symlink is
left over. Then, at the next package build, you will see a warning
like follows:
ln: '.' and 'kernel-4.14.0+/.' are the same file
It is possible to fix it by adding -n (--no-dereference) option to
the "ln" command, but a cleaner way is to use --transform option
of "tar" command. This option is GNU extension, but it should not
hurt to use it in the Linux build system.
The 'S' flag is needed to exclude symlinks from the path fixup.
Without it, symlinks in the kernel are broken.
Signed-off-by: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
---
scripts/package/Makefile | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/scripts/package/Makefile b/scripts/package/Makefile
index 493e226356ca..52917fb8e0c5 100644
--- a/scripts/package/Makefile
+++ b/scripts/package/Makefile
@@ -39,10 +39,9 @@ if test "$(objtree)" != "$(srctree)"; then \
false; \
fi ; \
$(srctree)/scripts/setlocalversion --save-scmversion; \
-ln -sf $(srctree) $(2); \
tar -cz $(RCS_TAR_IGNORE) -f $(2).tar.gz \
- $(addprefix $(2)/,$(TAR_CONTENT) $(3)); \
-rm -f $(2) $(objtree)/.scmversion
+ --transform 's:^:$(2)/:S' $(TAR_CONTENT) $(3); \
+rm -f $(objtree)/.scmversion
# rpm-pkg
# ---------------------------------------------------------------------------
--
2.11.0
From: Masahiro Yamada <yamada.masahiro(a)socionext.com>
[ Upstream commit 2dbc644ac62bbcb9ee78e84719953f611be0413d ]
For rpm-pkg and deb-pkg, a source tar file is created. All paths in
the archive must be prefixed with the base name of the tar so that
everything is contained in the directory when you extract it.
Currently, scripts/package/Makefile uses a symlink for that, and
removes it after the tar is created.
If you terminate the build during the tar creation, the symlink is
left over. Then, at the next package build, you will see a warning
like follows:
ln: '.' and 'kernel-4.14.0+/.' are the same file
It is possible to fix it by adding -n (--no-dereference) option to
the "ln" command, but a cleaner way is to use --transform option
of "tar" command. This option is GNU extension, but it should not
hurt to use it in the Linux build system.
The 'S' flag is needed to exclude symlinks from the path fixup.
Without it, symlinks in the kernel are broken.
Signed-off-by: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
---
scripts/package/Makefile | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/scripts/package/Makefile b/scripts/package/Makefile
index 71b4a8af9d4d..7badec3498b8 100644
--- a/scripts/package/Makefile
+++ b/scripts/package/Makefile
@@ -39,10 +39,9 @@ if test "$(objtree)" != "$(srctree)"; then \
false; \
fi ; \
$(srctree)/scripts/setlocalversion --save-scmversion; \
-ln -sf $(srctree) $(2); \
tar -cz $(RCS_TAR_IGNORE) -f $(2).tar.gz \
- $(addprefix $(2)/,$(TAR_CONTENT) $(3)); \
-rm -f $(2) $(objtree)/.scmversion
+ --transform 's:^:$(2)/:S' $(TAR_CONTENT) $(3); \
+rm -f $(objtree)/.scmversion
# rpm-pkg
# ---------------------------------------------------------------------------
--
2.11.0
From: Colin Ian King <colin.king(a)canonical.com>
[ Upstream commit e9990d70e8a063a7b894c5cbb99f630a0f41200d ]
The comparison of u32 nregs being less than zero is never true since
nregs is unsigned. Fix this by making nregs a signed integer.
Fixes: f20cc9b00c7b ("irqchip/qcom: Add IRQ combiner driver")
Signed-off-by: Colin Ian King <colin.king(a)canonical.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Marc Zyngier <marc.zyngier(a)arm.com>
Cc: kernel-janitors(a)vger.kernel.org
Cc: Jason Cooper <jason(a)lakedaemon.net>
Link: https://lkml.kernel.org/r/20171117183553.2739-1-colin.king@canonical.com
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
---
drivers/irqchip/qcom-irq-combiner.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/irqchip/qcom-irq-combiner.c b/drivers/irqchip/qcom-irq-combiner.c
index 6aa3ea479214..f31265937439 100644
--- a/drivers/irqchip/qcom-irq-combiner.c
+++ b/drivers/irqchip/qcom-irq-combiner.c
@@ -238,7 +238,7 @@ static int __init combiner_probe(struct platform_device *pdev)
{
struct combiner *combiner;
size_t alloc_sz;
- u32 nregs;
+ int nregs;
int err;
nregs = count_registers(pdev);
--
2.11.0
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Hi Greg,
Pleae pull commits for Linux 3.18 .
I've sent a review request for all commits over a week ago and all
comments were addressed.
Thanks,
Sasha
=====
The following changes since commit b42518053ffd221d79cff2df8c0257db88a71334:
Linux 3.18.85 (2017-11-30 08:35:56 +0000)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux-stable.git for-greg/4.14/3.18
for you to fetch changes up to 90cbc83fe2279c1f0b5e94196c27253513405d77:
perf test attr: Fix ignored test case result (2017-11-30 17:01:13 -0500)
- ----------------------------------------------------------------
Ben Hutchings (1):
usbip: tools: Install all headers needed for libusbip development
Boshi Wang (1):
ima: fix hash algorithm initialization
Gustavo A. R. Silva (1):
EDAC, sb_edac: Fix missing break in switch
Hiromitsu Yamasaki (1):
spi: sh-msiof: Fix DMA transfer size check
Jibin Xu (1):
sysrq : fix Show Regs call trace on ARM
Lukas Wunner (1):
serial: 8250_fintek: Fix rs485 disablement on invalid ioctl()
Masami Hiramatsu (1):
kprobes: Use synchronize_rcu_tasks() for optprobe with CONFIG_PREEMPT=y
Thomas Richter (1):
perf test attr: Fix ignored test case result
arch/Kconfig | 2 +-
drivers/edac/sb_edac.c | 1 +
drivers/spi/spi-sh-msiof.c | 2 +-
drivers/tty/serial/8250/8250_fintek.c | 2 +-
drivers/tty/sysrq.c | 9 +++++++--
kernel/kprobes.c | 14 ++++++++------
security/integrity/ima/ima_main.c | 4 ++++
tools/perf/tests/attr.c | 2 +-
tools/usb/usbip/Makefile.am | 3 ++-
9 files changed, 26 insertions(+), 13 deletions(-)
-----BEGIN PGP SIGNATURE-----
iQIcBAEBCAAGBQJaIIAGAAoJEN6mb/eXdyzcB/AP/2JM6q4DXlQNiG6gFgU1MAIR
N6uwbY9a+RRE/P2sSb0fTx+/f2JTdRhY/NK15y2up+vWxOAzBmqE/NotO0gzaMc1
a+8LkomLf9BS4I/ScRb/iAANUBwWcZNPFHuiYRNGg5y0Ru4uI2C4LNpmNXE6uhWC
ia7PjeElSbVZptmMAk1Fdb3A+eYZDwZUvBjf1Zcsjd3QYCS2psa/xQu+uvv4CsNX
1UlGnVqF6hIqrpO1JYgS1vRuGSsSdLiPVrFdVha9Npg6ORtBk6Be6y0vq7uHmvqD
x0+iFimMTadSRLPWW5u5CmXsDdhgXb3+5PPl8Bw4krJ7RX9/qWLWmyXJ3pGaZr7U
81LFws1Kt7sG+aDwCSq48rhUnbQAEardI16H9abujM3sCe3b0uboLm8PV9K7JlpZ
xiK5lo3U1ia2v1TKUeFS2KhdQtnV0YzHpfIDS9TDQk28kFFtkKDek3x8xM74Z6aa
xXtDi4qAJpa/73V9cr9DQM0UuavAk+3QDFEZras8C0dkMqi/DIEvpvZUQrvBMj3t
bCldw32ylhtg+xaKPNUO7CXoRoKmE6b2kAed0PJ9sZIXYwoS3PGZnvGUDXDNqOY1
Mad0Mhh8JDiEBtBpEuIQzSVSTRNIeOqGYS1RiCN0oPI0uY/3xWZ7NV2bj+KXAqOe
66nYqLJq2Ai7Dgx4l6Es
=eGmn
-----END PGP SIGNATURE-----
This is a note to let you know that I've just added the patch titled
mm, oom_reaper: gather each vma to prevent leaking TLB entry
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-oom_reaper-gather-each-vma-to-prevent-leaking-tlb-entry.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 687cb0884a714ff484d038e9190edc874edcf146 Mon Sep 17 00:00:00 2001
From: Wang Nan <wangnan0(a)huawei.com>
Date: Wed, 29 Nov 2017 16:09:58 -0800
Subject: mm, oom_reaper: gather each vma to prevent leaking TLB entry
From: Wang Nan <wangnan0(a)huawei.com>
commit 687cb0884a714ff484d038e9190edc874edcf146 upstream.
tlb_gather_mmu(&tlb, mm, 0, -1) means gathering the whole virtual memory
space. In this case, tlb->fullmm is true. Some archs like arm64
doesn't flush TLB when tlb->fullmm is true:
commit 5a7862e83000 ("arm64: tlbflush: avoid flushing when fullmm == 1").
Which causes leaking of tlb entries.
Will clarifies his patch:
"Basically, we tag each address space with an ASID (PCID on x86) which
is resident in the TLB. This means we can elide TLB invalidation when
pulling down a full mm because we won't ever assign that ASID to
another mm without doing TLB invalidation elsewhere (which actually
just nukes the whole TLB).
I think that means that we could potentially not fault on a kernel
uaccess, because we could hit in the TLB"
There could be a window between complete_signal() sending IPI to other
cores and all threads sharing this mm are really kicked off from cores.
In this window, the oom reaper may calls tlb_flush_mmu_tlbonly() to
flush TLB then frees pages. However, due to the above problem, the TLB
entries are not really flushed on arm64. Other threads are possible to
access these pages through TLB entries. Moreover, a copy_to_user() can
also write to these pages without generating page fault, causes
use-after-free bugs.
This patch gathers each vma instead of gathering full vm space. In this
case tlb->fullmm is not true. The behavior of oom reaper become similar
to munmapping before do_exit, which should be safe for all archs.
Link: http://lkml.kernel.org/r/20171107095453.179940-1-wangnan0@huawei.com
Fixes: aac453635549 ("mm, oom: introduce oom reaper")
Signed-off-by: Wang Nan <wangnan0(a)huawei.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: David Rientjes <rientjes(a)google.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: Bob Liu <liubo95(a)huawei.com>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Roman Gushchin <guro(a)fb.com>
Cc: Konstantin Khlebnikov <khlebnikov(a)yandex-team.ru>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/oom_kill.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -532,7 +532,6 @@ static bool __oom_reap_task_mm(struct ta
*/
set_bit(MMF_UNSTABLE, &mm->flags);
- tlb_gather_mmu(&tlb, mm, 0, -1);
for (vma = mm->mmap ; vma; vma = vma->vm_next) {
if (!can_madv_dontneed_vma(vma))
continue;
@@ -547,11 +546,13 @@ static bool __oom_reap_task_mm(struct ta
* we do not want to block exit_mmap by keeping mm ref
* count elevated without a good reason.
*/
- if (vma_is_anonymous(vma) || !(vma->vm_flags & VM_SHARED))
+ if (vma_is_anonymous(vma) || !(vma->vm_flags & VM_SHARED)) {
+ tlb_gather_mmu(&tlb, mm, vma->vm_start, vma->vm_end);
unmap_page_range(&tlb, vma, vma->vm_start, vma->vm_end,
NULL);
+ tlb_finish_mmu(&tlb, vma->vm_start, vma->vm_end);
+ }
}
- tlb_finish_mmu(&tlb, 0, -1);
pr_info("oom_reaper: reaped process %d (%s), now anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB\n",
task_pid_nr(tsk), tsk->comm,
K(get_mm_counter(mm, MM_ANONPAGES)),
Patches currently in stable-queue which might be from wangnan0(a)huawei.com are
queue-4.14/mm-oom_reaper-gather-each-vma-to-prevent-leaking-tlb-entry.patch
This is a note to let you know that I've just added the patch titled
mm, memory_hotplug: do not back off draining pcp free pages from kworker context
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-memory_hotplug-do-not-back-off-draining-pcp-free-pages-from-kworker-context.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 4b81cb2ff69c8a8e297a147d2eb4d9b5e8d7c435 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko(a)suse.com>
Date: Wed, 29 Nov 2017 16:09:54 -0800
Subject: mm, memory_hotplug: do not back off draining pcp free pages from kworker context
From: Michal Hocko <mhocko(a)suse.com>
commit 4b81cb2ff69c8a8e297a147d2eb4d9b5e8d7c435 upstream.
drain_all_pages backs off when called from a kworker context since
commit 0ccce3b92421 ("mm, page_alloc: drain per-cpu pages from workqueue
context") because the original IPI based pcp draining has been replaced
by a WQ based one and the check wanted to prevent from recursion and
inter workers dependencies. This has made some sense at the time
because the system WQ has been used and one worker holding the lock
could be blocked while waiting for new workers to emerge which can be a
problem under OOM conditions.
Since then commit ce612879ddc7 ("mm: move pcp and lru-pcp draining into
single wq") has moved draining to a dedicated (mm_percpu_wq) WQ with a
rescuer so we shouldn't depend on any other WQ activity to make a
forward progress so calling drain_all_pages from a worker context is
safe as long as this doesn't happen from mm_percpu_wq itself which is
not the case because all workers are required to _not_ depend on any MM
locks.
Why is this a problem in the first place? ACPI driven memory hot-remove
(acpi_device_hotplug) is executed from the worker context. We end up
calling __offline_pages to free all the pages and that requires both
lru_add_drain_all_cpuslocked and drain_all_pages to do their job
otherwise we can have dangling pages on pcp lists and fail the offline
operation (__test_page_isolated_in_pageblock would see a page with 0 ref
count but without PageBuddy set).
Fix the issue by removing the worker check in drain_all_pages.
lru_add_drain_all_cpuslocked doesn't have this restriction so it works
as expected.
Link: http://lkml.kernel.org/r/20170828093341.26341-1-mhocko@kernel.org
Fixes: 0ccce3b924212 ("mm, page_alloc: drain per-cpu pages from workqueue context")
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Tejun Heo <tj(a)kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/page_alloc.c | 4 ----
1 file changed, 4 deletions(-)
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2487,10 +2487,6 @@ void drain_all_pages(struct zone *zone)
if (WARN_ON_ONCE(!mm_percpu_wq))
return;
- /* Workqueues cannot recurse */
- if (current->flags & PF_WQ_WORKER)
- return;
-
/*
* Do not drain if one is already in progress unless it's specific to
* a zone. Such callers are primarily CMA and memory hotplug and need
Patches currently in stable-queue which might be from mhocko(a)suse.com are
queue-4.14/mm-oom_reaper-gather-each-vma-to-prevent-leaking-tlb-entry.patch
queue-4.14/mm-memory_hotplug-do-not-back-off-draining-pcp-free-pages-from-kworker-context.patch