From: Huang Ying <ying.huang(a)intel.com>
It was reported by Sergey Senozhatsky that if THP (Transparent Huge
Page) and frontswap (via zswap) are both enabled, when memory goes low
so that swap is triggered, segfault and memory corruption will occur
in random user space applications as follow,
kernel: urxvt[338]: segfault at 20 ip 00007fc08889ae0d sp 00007ffc73a7fc40 error 6 in libc-2.26.so[7fc08881a000+1ae000]
#0 0x00007fc08889ae0d _int_malloc (libc.so.6)
#1 0x00007fc08889c2f3 malloc (libc.so.6)
#2 0x0000560e6004bff7 _Z14rxvt_wcstoutf8PKwi (urxvt)
#3 0x0000560e6005e75c n/a (urxvt)
#4 0x0000560e6007d9f1 _ZN16rxvt_perl_interp6invokeEP9rxvt_term9hook_typez (urxvt)
#5 0x0000560e6003d988 _ZN9rxvt_term9cmd_parseEv (urxvt)
#6 0x0000560e60042804 _ZN9rxvt_term6pty_cbERN2ev2ioEi (urxvt)
#7 0x0000560e6005c10f _Z17ev_invoke_pendingv (urxvt)
#8 0x0000560e6005cb55 ev_run (urxvt)
#9 0x0000560e6003b9b9 main (urxvt)
#10 0x00007fc08883af4a __libc_start_main (libc.so.6)
#11 0x0000560e6003f9da _start (urxvt)
After bisection, it was found the first bad commit is
bd4c82c22c367e068 ("mm, THP, swap: delay splitting THP after swapped
out").
The root cause is as follow.
When the pages are written to storage device during swapping out in
swap_writepage(), zswap (fontswap) is tried to compress the pages
instead to improve the performance. But zswap (frontswap) will treat
THP as normal page, so only the head page is saved. After swapping
in, tail pages will not be restored to its original contents, so cause
the memory corruption in the applications.
This is fixed via splitting THP at the begin of swapping out if
frontswap is enabled. To avoid frontswap to be enabled at runtime,
whether the page is THP is checked before using frontswap during
swapping out too.
Reported-and-tested-by: Sergey Senozhatsky <sergey.senozhatsky(a)gmail.com>
Signed-off-by: "Huang, Ying" <ying.huang(a)intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Cc: Dan Streetman <ddstreet(a)ieee.org>
Cc: Seth Jennings <sjenning(a)redhat.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Cc: Shaohua Li <shli(a)kernel.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: stable(a)vger.kernel.org # 4.14
Fixes: bd4c82c22c367e068 ("mm, THP, swap: delay splitting THP after swapped out")
---
mm/page_io.c | 2 +-
mm/vmscan.c | 16 +++++++++++++---
2 files changed, 14 insertions(+), 4 deletions(-)
diff --git a/mm/page_io.c b/mm/page_io.c
index b41cf9644585..6dca817ae7a0 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -250,7 +250,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc)
unlock_page(page);
goto out;
}
- if (frontswap_store(page) == 0) {
+ if (!PageTransHuge(page) && frontswap_store(page) == 0) {
set_page_writeback(page);
unlock_page(page);
end_page_writeback(page);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index bee53495a829..d1c1e00b08bb 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -55,6 +55,7 @@
#include <linux/swapops.h>
#include <linux/balloon_compaction.h>
+#include <linux/frontswap.h>
#include "internal.h"
@@ -1063,14 +1064,23 @@ static unsigned long shrink_page_list(struct list_head *page_list,
/* cannot split THP, skip it */
if (!can_split_huge_page(page, NULL))
goto activate_locked;
+ /*
+ * Split THP if frontswap enabled,
+ * because it cannot process THP
+ */
+ if (frontswap_enabled()) {
+ if (split_huge_page_to_list(
+ page, page_list))
+ goto activate_locked;
+ }
/*
* Split pages without a PMD map right
* away. Chances are some or all of the
* tail pages can be freed without IO.
*/
- if (!compound_mapcount(page) &&
- split_huge_page_to_list(page,
- page_list))
+ else if (!compound_mapcount(page) &&
+ split_huge_page_to_list(page,
+ page_list))
goto activate_locked;
}
if (!add_to_swap(page)) {
--
2.15.1
From: Eric Biggers <ebiggers(a)google.com>
The asymmetric key type allows an X.509 certificate to be added even if
its signature's hash algorithm is not available in the crypto API. In
that case 'payload.data[asym_auth]' will be NULL. But the key
restriction code failed to check for this case before trying to use the
signature, resulting in a NULL pointer dereference in
key_or_keyring_common() or in restrict_link_by_signature().
Fix this by returning -ENOPKG when the signature is unsupported.
Reproducer when all the CONFIG_CRYPTO_SHA512* options are disabled and
keyctl has support for the 'restrict_keyring' command:
keyctl new_session
keyctl restrict_keyring @s asymmetric builtin_trusted
openssl req -new -sha512 -x509 -batch -nodes -outform der \
| keyctl padd asymmetric desc @s
Fixes: a511e1af8b12 ("KEYS: Move the point of trust determination to __key_link()")
Cc: <stable(a)vger.kernel.org> # v4.7+
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
crypto/asymmetric_keys/restrict.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/crypto/asymmetric_keys/restrict.c b/crypto/asymmetric_keys/restrict.c
index 86fb68508952..7c93c7728454 100644
--- a/crypto/asymmetric_keys/restrict.c
+++ b/crypto/asymmetric_keys/restrict.c
@@ -67,8 +67,9 @@ __setup("ca_keys=", ca_keys_setup);
*
* Returns 0 if the new certificate was accepted, -ENOKEY if we couldn't find a
* matching parent certificate in the trusted list, -EKEYREJECTED if the
- * signature check fails or the key is blacklisted and some other error if
- * there is a matching certificate but the signature check cannot be performed.
+ * signature check fails or the key is blacklisted, -ENOPKG if the signature
+ * uses unsupported crypto, or some other error if there is a matching
+ * certificate but the signature check cannot be performed.
*/
int restrict_link_by_signature(struct key *dest_keyring,
const struct key_type *type,
@@ -88,6 +89,8 @@ int restrict_link_by_signature(struct key *dest_keyring,
return -EOPNOTSUPP;
sig = payload->data[asym_auth];
+ if (!sig)
+ return -ENOPKG;
if (!sig->auth_ids[0] && !sig->auth_ids[1])
return -ENOKEY;
@@ -139,6 +142,8 @@ static int key_or_keyring_common(struct key *dest_keyring,
return -EOPNOTSUPP;
sig = payload->data[asym_auth];
+ if (!sig)
+ return -ENOPKG;
if (!sig->auth_ids[0] && !sig->auth_ids[1])
return -ENOKEY;
@@ -222,9 +227,9 @@ static int key_or_keyring_common(struct key *dest_keyring,
*
* Returns 0 if the new certificate was accepted, -ENOKEY if we
* couldn't find a matching parent certificate in the trusted list,
- * -EKEYREJECTED if the signature check fails, and some other error if
- * there is a matching certificate but the signature check cannot be
- * performed.
+ * -EKEYREJECTED if the signature check fails, -ENOPKG if the signature uses
+ * unsupported crypto, or some other error if there is a matching certificate
+ * but the signature check cannot be performed.
*/
int restrict_link_by_key_or_keyring(struct key *dest_keyring,
const struct key_type *type,
@@ -249,9 +254,9 @@ int restrict_link_by_key_or_keyring(struct key *dest_keyring,
*
* Returns 0 if the new certificate was accepted, -ENOKEY if we
* couldn't find a matching parent certificate in the trusted list,
- * -EKEYREJECTED if the signature check fails, and some other error if
- * there is a matching certificate but the signature check cannot be
- * performed.
+ * -EKEYREJECTED if the signature check fails, -ENOPKG if the signature uses
+ * unsupported crypto, or some other error if there is a matching certificate
+ * but the signature check cannot be performed.
*/
int restrict_link_by_key_or_keyring_chain(struct key *dest_keyring,
const struct key_type *type,
--
2.16.0.rc1.238.g530d649a79-goog
From: Eric Biggers <ebiggers(a)google.com>
If there is a blacklisted certificate in a SignerInfo's certificate
chain, then pkcs7_verify_sig_chain() sets sinfo->blacklisted and returns
0. But, pkcs7_verify() fails to handle this case appropriately, as it
actually continues on to the line 'actual_ret = 0;', indicating that the
SignerInfo has passed verification. Consequently, PKCS#7 signature
verification ignores the certificate blacklist.
Fix this by not considering blacklisted SignerInfos to have passed
verification.
Also fix the function comment with regards to when 0 is returned.
Fixes: 03bb79315ddc ("PKCS#7: Handle blacklisted certificates")
Cc: <stable(a)vger.kernel.org> # v4.12+
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
crypto/asymmetric_keys/pkcs7_verify.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/crypto/asymmetric_keys/pkcs7_verify.c b/crypto/asymmetric_keys/pkcs7_verify.c
index 2f6a768b91d7..97c77f66b20d 100644
--- a/crypto/asymmetric_keys/pkcs7_verify.c
+++ b/crypto/asymmetric_keys/pkcs7_verify.c
@@ -366,8 +366,7 @@ static int pkcs7_verify_one(struct pkcs7_message *pkcs7,
*
* (*) -EBADMSG if some part of the message was invalid, or:
*
- * (*) 0 if no signature chains were found to be blacklisted or to contain
- * unsupported crypto, or:
+ * (*) 0 if a signature chain passed verification, or:
*
* (*) -EKEYREJECTED if a blacklisted key was encountered, or:
*
@@ -423,8 +422,11 @@ int pkcs7_verify(struct pkcs7_message *pkcs7,
for (sinfo = pkcs7->signed_infos; sinfo; sinfo = sinfo->next) {
ret = pkcs7_verify_one(pkcs7, sinfo);
- if (sinfo->blacklisted && actual_ret == -ENOPKG)
- actual_ret = -EKEYREJECTED;
+ if (sinfo->blacklisted) {
+ if (actual_ret == -ENOPKG)
+ actual_ret = -EKEYREJECTED;
+ continue;
+ }
if (ret < 0) {
if (ret == -ENOPKG) {
sinfo->unsupported_crypto = true;
--
2.16.0.rc1.238.g530d649a79-goog
From: Eric Biggers <ebiggers(a)google.com>
When pkcs7_verify_sig_chain() is building the certificate chain for a
SignerInfo using the certificates in the PKCS#7 message, it is passing
the wrong arguments to public_key_verify_signature(). Consequently,
when the next certificate is supposed to be used to verify the previous
certificate, the next certificate is actually used to verify itself.
An attacker can use this bug to create a bogus certificate chain that
has no cryptographic relationship between the beginning and end.
Fortunately I couldn't quite find a way to use this to bypass the
overall signature verification, though it comes very close. Here's the
reasoning: due to the bug, every certificate in the chain beyond the
first actually has to be self-signed (where "self-signed" here refers to
the actual key and signature; an attacker might still manipulate the
certificate fields such that the self_signed flag doesn't actually get
set, and thus the chain doesn't end immediately). But to pass trust
validation (pkcs7_validate_trust()), either the SignerInfo or one of the
certificates has to actually be signed by a trusted key. Since only
self-signed certificates can be added to the chain, the only way for an
attacker to introduce a trusted signature is to include a self-signed
trusted certificate.
But, when pkcs7_validate_trust_one() reaches that certificate, instead
of trying to verify the signature on that certificate, it will actually
look up the corresponding trusted key, which will succeed, and then try
to verify the *previous* certificate, which will fail. Thus, disaster
is narrowly averted (as far as I could tell).
Fixes: 6c2dc5ae4ab7 ("X.509: Extract signature digest and make self-signed cert checks earlier")
Cc: <stable(a)vger.kernel.org> # v4.7+
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
crypto/asymmetric_keys/pkcs7_verify.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/crypto/asymmetric_keys/pkcs7_verify.c b/crypto/asymmetric_keys/pkcs7_verify.c
index 39e6de0c2761..2f6a768b91d7 100644
--- a/crypto/asymmetric_keys/pkcs7_verify.c
+++ b/crypto/asymmetric_keys/pkcs7_verify.c
@@ -270,7 +270,7 @@ static int pkcs7_verify_sig_chain(struct pkcs7_message *pkcs7,
sinfo->index);
return 0;
}
- ret = public_key_verify_signature(p->pub, p->sig);
+ ret = public_key_verify_signature(p->pub, x509->sig);
if (ret < 0)
return ret;
x509->signer = p;
--
2.16.0.rc1.238.g530d649a79-goog
From: Eric Biggers <ebiggers(a)google.com>
Subject: pipe: actually allow root to exceed the pipe buffer limits
pipe-user-pages-hard and pipe-user-pages-soft are only supposed to apply
to unprivileged users, as documented in both Documentation/sysctl/fs.txt
and the pipe(7) man page.
However, the capabilities are actually only checked when increasing a
pipe's size using F_SETPIPE_SZ, not when creating a new pipe. Therefore,
if pipe-user-pages-hard has been set, the root user can run into it and be
unable to create pipes. Similarly, if pipe-user-pages-soft has been set,
the root user can run into it and have their pipes limited to 1 page each.
Fix this by allowing the privileged override in both cases.
Link: http://lkml.kernel.org/r/20180111052902.14409-4-ebiggers3@gmail.com
Fixes: 759c01142a5d ("pipe: limit the per-user amount of pages allocated in pipes")
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Acked-by: Kees Cook <keescook(a)chromium.org>
Acked-by: Joe Lawrence <joe.lawrence(a)redhat.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: "Luis R . Rodriguez" <mcgrof(a)kernel.org>
Cc: Michael Kerrisk <mtk.manpages(a)gmail.com>
Cc: Mikulas Patocka <mpatocka(a)redhat.com>
Cc: Willy Tarreau <w(a)1wt.eu>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/pipe.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff -puN fs/pipe.c~pipe-actually-allow-root-to-exceed-the-pipe-buffer-limits fs/pipe.c
--- a/fs/pipe.c~pipe-actually-allow-root-to-exceed-the-pipe-buffer-limits
+++ a/fs/pipe.c
@@ -613,6 +613,11 @@ static bool too_many_pipe_buffers_hard(u
return pipe_user_pages_hard && user_bufs >= pipe_user_pages_hard;
}
+static bool is_unprivileged_user(void)
+{
+ return !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN);
+}
+
struct pipe_inode_info *alloc_pipe_info(void)
{
struct pipe_inode_info *pipe;
@@ -629,12 +634,12 @@ struct pipe_inode_info *alloc_pipe_info(
user_bufs = account_pipe_buffers(user, 0, pipe_bufs);
- if (too_many_pipe_buffers_soft(user_bufs)) {
+ if (too_many_pipe_buffers_soft(user_bufs) && is_unprivileged_user()) {
user_bufs = account_pipe_buffers(user, pipe_bufs, 1);
pipe_bufs = 1;
}
- if (too_many_pipe_buffers_hard(user_bufs))
+ if (too_many_pipe_buffers_hard(user_bufs) && is_unprivileged_user())
goto out_revert_acct;
pipe->bufs = kcalloc(pipe_bufs, sizeof(struct pipe_buffer),
@@ -1065,7 +1070,7 @@ static long pipe_set_size(struct pipe_in
if (nr_pages > pipe->buffers &&
(too_many_pipe_buffers_hard(user_bufs) ||
too_many_pipe_buffers_soft(user_bufs)) &&
- !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) {
+ is_unprivileged_user()) {
ret = -EPERM;
goto out_revert_acct;
}
_
From: Arnd Bergmann <arnd(a)arndb.de>
Subject: kasan: rework Kconfig settings
We get a lot of very large stack frames using gcc-7.0.1 with the default
-fsanitize-address-use-after-scope --param asan-stack=1 options, which can
easily cause an overflow of the kernel stack, e.g.
drivers/gpu/drm/i915/gvt/handlers.c:2434:1: warning: the frame size of 46176 bytes is larger than 3072 bytes
drivers/net/wireless/ralink/rt2x00/rt2800lib.c:5650:1: warning: the frame size of 23632 bytes is larger than 3072 bytes
lib/atomic64_test.c:250:1: warning: the frame size of 11200 bytes is larger than 3072 bytes
drivers/gpu/drm/i915/gvt/handlers.c:2621:1: warning: the frame size of 9208 bytes is larger than 3072 bytes
drivers/media/dvb-frontends/stv090x.c:3431:1: warning: the frame size of 6816 bytes is larger than 3072 bytes
fs/fscache/stats.c:287:1: warning: the frame size of 6536 bytes is larger than 3072 bytes
To reduce this risk, -fsanitize-address-use-after-scope is now split out
into a separate CONFIG_KASAN_EXTRA Kconfig option, leading to stack frames
that are smaller than 2 kilobytes most of the time on x86_64. An earlier
version of this patch also prevented combining KASAN_EXTRA with
KASAN_INLINE, but that is no longer necessary with gcc-7.0.1.
All patches to get the frame size below 2048 bytes with CONFIG_KASAN=y and
CONFIG_KASAN_EXTRA=n have been merged by maintainers now, so we can bring
back that default now. KASAN_EXTRA=y still causes lots of warnings but
now defaults to !COMPILE_TEST to disable it in allmodconfig, and it
remains disabled in all other defconfigs since it is a new option. I
arbitrarily raise the warning limit for KASAN_EXTRA to 3072 to reduce the
noise, but an allmodconfig kernel still has around 50 warnings on gcc-7.
I experimented a bit more with smaller stack frames and have another
follow-up series that reduces the warning limit for 64-bit architectures
to 1280 bytes (without CONFIG_KASAN).
With earlier versions of this patch series, I also had patches to address
the warnings we get with KASAN and/or KASAN_EXTRA, using a
"noinline_if_stackbloat" annotation. That annotation now got replaced
with a gcc-8 bugfix (see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715) and a workaround for
older compilers, which means that KASAN_EXTRA is now just as bad as before
and will lead to an instant stack overflow in a few extreme cases.
This reverts parts of commit commit 3f181b4 ("lib/Kconfig.debug: disable
-Wframe-larger-than warnings with KASAN=y"). Two patches in linux-next
should be merged first to avoid introducing warnings in an allmodconfig
build:
3cd890dbe2a4 ("media: dvb-frontends: fix i2c access helpers for KASAN")
16c3ada89cff ("media: r820t: fix r820t_write_reg for KASAN")
Do we really need to backport this?
I think we do: without this patch, enabling KASAN will lead to
unavoidable kernel stack overflow in certain device drivers when built
with gcc-7 or higher on linux-4.10+ or any version that contains a
backport of commit c5caf21ab0cf8. Most people are probably still on
older compilers, but it will get worse over time as they upgrade their
distros.
The warnings we get on kernels older than this should all be for code
that uses dangerously large stack frames, though most of them do not
cause an actual stack overflow by themselves.The asan-stack option was
added in linux-4.0, and commit 3f181b4d8652 ("lib/Kconfig.debug:
disable -Wframe-larger-than warnings with KASAN=y") effectively turned
off the warning for allmodconfig kernels, so I would like to see this
fix backported to any kernels later than 4.0.
I have done dozens of fixes for individual functions with stack frames
larger than 2048 bytes with asan-stack, and I plan to make sure that
all those fixes make it into the stable kernels as well (most are
already there).
Part of the complication here is that asan-stack (from 4.0) was
originally assumed to always require much larger stacks, but that
turned out to be a combination of multiple gcc bugs that we have now
worked around and fixed, but sanitize-address-use-after-scope (from
v4.10) has a much higher inherent stack usage and also suffers from at
least three other problems that we have analyzed but not yet fixed
upstream, each of them makes the stack usage more severe than it should
be.
Link: http://lkml.kernel.org/r/20171221134744.2295529-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Acked-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/Kconfig.debug | 2 +-
lib/Kconfig.kasan | 11 +++++++++++
scripts/Makefile.kasan | 2 ++
3 files changed, 14 insertions(+), 1 deletion(-)
diff -puN lib/Kconfig.debug~kasan-rework-kconfig-settings lib/Kconfig.debug
--- a/lib/Kconfig.debug~kasan-rework-kconfig-settings
+++ a/lib/Kconfig.debug
@@ -217,7 +217,7 @@ config ENABLE_MUST_CHECK
config FRAME_WARN
int "Warn for stack frames larger than (needs gcc 4.4)"
range 0 8192
- default 0 if KASAN
+ default 3072 if KASAN_EXTRA
default 2048 if GCC_PLUGIN_LATENT_ENTROPY
default 1280 if (!64BIT && PARISC)
default 1024 if (!64BIT && !PARISC)
diff -puN lib/Kconfig.kasan~kasan-rework-kconfig-settings lib/Kconfig.kasan
--- a/lib/Kconfig.kasan~kasan-rework-kconfig-settings
+++ a/lib/Kconfig.kasan
@@ -20,6 +20,17 @@ config KASAN
Currently CONFIG_KASAN doesn't work with CONFIG_DEBUG_SLAB
(the resulting kernel does not boot).
+config KASAN_EXTRA
+ bool "KAsan: extra checks"
+ depends on KASAN && DEBUG_KERNEL && !COMPILE_TEST
+ help
+ This enables further checks in the kernel address sanitizer, for now
+ it only includes the address-use-after-scope check that can lead
+ to excessive kernel stack usage, frame size warnings and longer
+ compile time.
+ https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715 has more
+
+
choice
prompt "Instrumentation type"
depends on KASAN
diff -puN scripts/Makefile.kasan~kasan-rework-kconfig-settings scripts/Makefile.kasan
--- a/scripts/Makefile.kasan~kasan-rework-kconfig-settings
+++ a/scripts/Makefile.kasan
@@ -38,7 +38,9 @@ else
endif
+ifdef CONFIG_KASAN_EXTRA
CFLAGS_KASAN += $(call cc-option, -fsanitize-address-use-after-scope)
+endif
CFLAGS_KASAN_NOSANITIZE := -fno-builtin
_