This is a note to let you know that I've just added the patch titled
af_key: fix buffer overread in verify_address_len()
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
af_key-fix-buffer-overread-in-verify_address_len.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 06b335cb51af018d5feeff5dd4fd53847ddb675a Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Fri, 29 Dec 2017 18:13:05 -0600
Subject: af_key: fix buffer overread in verify_address_len()
From: Eric Biggers <ebiggers(a)google.com>
commit 06b335cb51af018d5feeff5dd4fd53847ddb675a upstream.
If a message sent to a PF_KEY socket ended with one of the extensions
that takes a 'struct sadb_address' but there were not enough bytes
remaining in the message for the ->sa_family member of the 'struct
sockaddr' which is supposed to follow, then verify_address_len() read
past the end of the message, into uninitialized memory. Fix it by
returning -EINVAL in this case.
This bug was found using syzkaller with KMSAN.
Reproducer:
#include <linux/pfkeyv2.h>
#include <sys/socket.h>
#include <unistd.h>

int main()
{
    int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
    char buf[24] = { 0 };
    struct sadb_msg *msg = (void *)buf;
    struct sadb_address *addr = (void *)(msg + 1);

    msg->sadb_msg_version = PF_KEY_V2;
    msg->sadb_msg_type = SADB_DELETE;
    msg->sadb_msg_len = 3;
    addr->sadb_address_len = 1;
    addr->sadb_address_exttype = SADB_EXT_ADDRESS_SRC;

    write(sock, buf, 24);
}
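For context, here is the size arithmetic behind this reproducer, as a rough userspace sketch (illustrative only, not part of the patch; it only assumes the UAPI structs from <linux/pfkeyv2.h>). Both length fields count 8-byte units, so the 24-byte message is a 16-byte sadb_msg followed by an 8-byte sadb_address extension, leaving zero bytes for the sockaddr whose sa_family verify_address_len() goes on to read; the added check requires at least DIV_ROUND_UP(8 + 2, 8) = 2 units, so the reproducer's value of 1 is now rejected.

/* Illustrative only: prints the sizes that make the 24-byte reproducer
 * message too short to hold the sockaddr's sa_family field.
 * Build on Linux with the kernel UAPI headers installed.
 */
#include <linux/pfkeyv2.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/socket.h>

int main(void)
{
    printf("sizeof(struct sadb_msg)     = %zu\n", sizeof(struct sadb_msg));     /* 16 */
    printf("sizeof(struct sadb_address) = %zu\n", sizeof(struct sadb_address)); /* 8  */
    printf("offsetofend(sockaddr, sa_family) = %zu\n",
           offsetof(struct sockaddr, sa_family) + sizeof(sa_family_t));         /* 2  */
    /*
     * sadb_address_len == 1 declares an 8-byte extension: exactly
     * sizeof(struct sadb_address), with no room for the sockaddr that is
     * supposed to follow -- so the old code read sa_family past the end
     * of the 24-byte message.
     */
    return 0;
}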
Reported-by: Alexander Potapenko <glider(a)google.com>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/key/af_key.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -401,6 +401,11 @@ static int verify_address_len(const void
#endif
int len;
+ if (sp->sadb_address_len <
+ DIV_ROUND_UP(sizeof(*sp) + offsetofend(typeof(*addr), sa_family),
+ sizeof(uint64_t)))
+ return -EINVAL;
+
switch (addr->sa_family) {
case AF_INET:
len = DIV_ROUND_UP(sizeof(*sp) + sizeof(*sin), sizeof(uint64_t));
Patches currently in stable-queue which might be from ebiggers(a)google.com are
queue-4.4/af_key-fix-buffer-overread-in-parse_exthdrs.patch
queue-4.4/af_key-fix-buffer-overread-in-verify_address_len.patch
This is a note to let you know that I've just added the patch titled
af_key: fix buffer overread in parse_exthdrs()
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
af_key-fix-buffer-overread-in-parse_exthdrs.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 4e765b4972af7b07adcb1feb16e7a525ce1f6b28 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Fri, 29 Dec 2017 18:15:23 -0600
Subject: af_key: fix buffer overread in parse_exthdrs()
From: Eric Biggers <ebiggers(a)google.com>
commit 4e765b4972af7b07adcb1feb16e7a525ce1f6b28 upstream.
If a message sent to a PF_KEY socket ended with an incomplete extension
header (fewer than 4 bytes remaining), then parse_exthdrs() read past
the end of the message, into uninitialized memory. Fix it by returning
-EINVAL in this case.
Reproducer:
#include <linux/pfkeyv2.h>
#include <sys/socket.h>
#include <unistd.h>

int main()
{
    int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
    char buf[17] = { 0 };
    struct sadb_msg *msg = (void *)buf;

    msg->sadb_msg_version = PF_KEY_V2;
    msg->sadb_msg_type = SADB_DELETE;
    msg->sadb_msg_len = 2;

    write(sock, buf, 17);
}
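For context (illustrative arithmetic only, not part of the patch): sadb_msg_len = 2 declares a 16-byte message, yet the write() submits 17 bytes. The base-message length check effectively compares lengths in whole 8-byte units (17 / 8 == 2 -- inferred from the fact that the 17-byte reproducer is accepted at all), so the stray trailing byte slips through and parse_exthdrs() is left with 1 byte, fewer than the 4-byte struct sadb_ext the new check now demands.

/* Illustrative only: the arithmetic that lets a 17-byte message reach
 * parse_exthdrs() with less than one full extension header remaining.
 */
#include <linux/pfkeyv2.h>
#include <stdio.h>

int main(void)
{
    unsigned int written = 17;  /* bytes the reproducer write()s */
    unsigned int msg_len = 2;   /* sadb_msg_len, in 8-byte units */

    /* 17 / 8 == 2, so the declared length matches in whole units. */
    printf("passes whole-unit length check: %d\n", written / 8 == msg_len);

    unsigned int remaining = written - sizeof(struct sadb_msg);         /* 17 - 16 = 1 */
    printf("bytes left after sadb_msg: %u\n", remaining);
    printf("sizeof(struct sadb_ext):   %zu\n", sizeof(struct sadb_ext)); /* 4 */

    /* remaining > 0, so the old loop read a 4-byte sadb_ext from a 1-byte
     * tail; the new "if (len < sizeof(*ehdr)) return -EINVAL;" stops it. */
    return 0;
}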
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/key/af_key.c | 3 +++
1 file changed, 3 insertions(+)
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -516,6 +516,9 @@ static int parse_exthdrs(struct sk_buff
uint16_t ext_type;
int ext_len;
+ if (len < sizeof(*ehdr))
+ return -EINVAL;
+
ext_len = ehdr->sadb_ext_len;
ext_len *= sizeof(uint64_t);
ext_type = ehdr->sadb_ext_type;
Patches currently in stable-queue which might be from ebiggers(a)google.com are
queue-4.4/af_key-fix-buffer-overread-in-parse_exthdrs.patch
queue-4.4/af_key-fix-buffer-overread-in-verify_address_len.patch
This is a note to let you know that I've just added the patch titled
timers: Unconditionally check deferrable base
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
timers-unconditionally-check-deferrable-base.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ed4bbf7910b28ce3c691aef28d245585eaabda06 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Sun, 14 Jan 2018 23:19:49 +0100
Subject: timers: Unconditionally check deferrable base
From: Thomas Gleixner <tglx(a)linutronix.de>
commit ed4bbf7910b28ce3c691aef28d245585eaabda06 upstream.
When the timer base is checked for expired timers then the deferrable base
must be checked as well. This was missed when making the deferrable base
independent of base::nohz_active.
Fixes: ced6d5c11d3e ("timers: Use deferrable base independent of base::nohz_active")
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Anna-Maria Gleixner <anna-maria(a)linutronix.de>
Cc: Frederic Weisbecker <fweisbec(a)gmail.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Sebastian Siewior <bigeasy(a)linutronix.de>
Cc: Paul McKenney <paulmck(a)linux.vnet.ibm.com>
Cc: rt(a)linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/time/timer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1656,7 +1656,7 @@ void run_local_timers(void)
hrtimer_run_queues();
/* Raise the softirq only if required. */
if (time_before(jiffies, base->clk)) {
- if (!IS_ENABLED(CONFIG_NO_HZ_COMMON) || !base->nohz_active)
+ if (!IS_ENABLED(CONFIG_NO_HZ_COMMON))
return;
/* CPU is awake, so check the deferrable base. */
base++;
Patches currently in stable-queue which might be from tglx(a)linutronix.de are
queue-4.14/futex-prevent-overflow-by-strengthen-input-validation.patch
queue-4.14/objtool-fix-clang-enum-conversion-warning.patch
queue-4.14/timers-unconditionally-check-deferrable-base.patch
queue-4.14/futex-avoid-violating-the-10th-rule-of-futex.patch
queue-4.14/delayacct-account-blkio-completion-on-the-correct-task.patch
queue-4.14/objtool-fix-seg-fault-with-clang-compiled-objects.patch
queue-4.14/objtool-fix-seg-fault-caused-by-missing-parameter.patch
This is a note to let you know that I've just added the patch titled
RDMA/mlx5: Fix out-of-bound access while querying AH
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
rdma-mlx5-fix-out-of-bound-access-while-querying-ah.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ae59c3f0b6cfd472fed96e50548a799b8971d876 Mon Sep 17 00:00:00 2001
From: Leon Romanovsky <leonro(a)mellanox.com>
Date: Fri, 12 Jan 2018 07:58:39 +0200
Subject: RDMA/mlx5: Fix out-of-bound access while querying AH
From: Leon Romanovsky <leonro(a)mellanox.com>
commit ae59c3f0b6cfd472fed96e50548a799b8971d876 upstream.
The rdma_ah_find_type() accesses the port array based on an index
controlled by userspace. The existing bounds check is after the first use
of the index, so userspace can generate an out-of-bounds access, as shown
by the KASAN report below.
==================================================================
BUG: KASAN: slab-out-of-bounds in to_rdma_ah_attr+0xa8/0x3b0
Read of size 4 at addr ffff880019ae2268 by task ibv_rc_pingpong/409
CPU: 0 PID: 409 Comm: ibv_rc_pingpong Not tainted 4.15.0-rc2-00031-gb60a3faf5b83-dirty #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
Call Trace:
dump_stack+0xe9/0x18f
print_address_description+0xa2/0x350
kasan_report+0x3a5/0x400
to_rdma_ah_attr+0xa8/0x3b0
mlx5_ib_query_qp+0xd35/0x1330
ib_query_qp+0x8a/0xb0
ib_uverbs_query_qp+0x237/0x7f0
ib_uverbs_write+0x617/0xd80
__vfs_write+0xf7/0x500
vfs_write+0x149/0x310
SyS_write+0xca/0x190
entry_SYSCALL_64_fastpath+0x18/0x85
RIP: 0033:0x7fe9c7a275a0
RSP: 002b:00007ffee5498738 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007fe9c7ce4b00 RCX: 00007fe9c7a275a0
RDX: 0000000000000018 RSI: 00007ffee5498800 RDI: 0000000000000003
RBP: 000055d0c8d3f010 R08: 00007ffee5498800 R09: 0000000000000018
R10: 00000000000000ba R11: 0000000000000246 R12: 0000000000008000
R13: 0000000000004fb0 R14: 000055d0c8d3f050 R15: 00007ffee5498560
Allocated by task 1:
__kmalloc+0x3f9/0x430
alloc_mad_private+0x25/0x50
ib_mad_post_receive_mads+0x204/0xa60
ib_mad_init_device+0xa59/0x1020
ib_register_device+0x83a/0xbc0
mlx5_ib_add+0x50e/0x5c0
mlx5_add_device+0x142/0x410
mlx5_register_interface+0x18f/0x210
mlx5_ib_init+0x56/0x63
do_one_initcall+0x15b/0x270
kernel_init_freeable+0x2d8/0x3d0
kernel_init+0x14/0x190
ret_from_fork+0x24/0x30
Freed by task 0:
(stack is not available)
The buggy address belongs to the object at ffff880019ae2000
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 104 bytes to the right of
512-byte region [ffff880019ae2000, ffff880019ae2200)
The buggy address belongs to the page:
page:000000005d674e18 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0
flags: 0x4000000000008100(slab|head)
raw: 4000000000008100 0000000000000000 0000000000000000 00000001000c000c
raw: dead000000000100 dead000000000200 ffff88001a402000 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff880019ae2100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff880019ae2180: 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc
>ffff880019ae2200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffff880019ae2280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff880019ae2300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
Disabling lock debugging due to kernel taint
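The pattern behind the fix is simply to bounds-check the user-controlled port index before the first array access; here is a generic userspace sketch of that shape (made-up names, not the mlx5 or rdma core API):

/* Illustrative only: validate a user-controlled, 1-based index before any
 * array access, mirroring the shape of the fix below.
 */
#include <stdint.h>
#include <stdio.h>

#define NUM_PORTS 2

static const uint32_t port_caps[NUM_PORTS] = { 0x1, 0x3 };

static int query_port(uint32_t port)   /* 1-based, attacker-controlled */
{
    /* Check first; the bug was doing the lookup before this check. */
    if (port == 0 || port > NUM_PORTS)
        return -1;

    printf("port %u caps=0x%x\n", port, port_caps[port - 1]);
    return 0;
}

int main(void)
{
    query_port(200);   /* rejected instead of reading out of bounds */
    query_port(1);
    return 0;
}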
Fixes: 44c58487d51a ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types")
Signed-off-by: Leon Romanovsky <leonro(a)mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/hw/mlx5/qp.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -4303,12 +4303,11 @@ static void to_rdma_ah_attr(struct mlx5_
memset(ah_attr, 0, sizeof(*ah_attr));
- ah_attr->type = rdma_ah_find_type(&ibdev->ib_dev, path->port);
- rdma_ah_set_port_num(ah_attr, path->port);
- if (rdma_ah_get_port_num(ah_attr) == 0 ||
- rdma_ah_get_port_num(ah_attr) > MLX5_CAP_GEN(dev, num_ports))
+ if (!path->port || path->port > MLX5_CAP_GEN(dev, num_ports))
return;
+ ah_attr->type = rdma_ah_find_type(&ibdev->ib_dev, path->port);
+
rdma_ah_set_port_num(ah_attr, path->port);
rdma_ah_set_sl(ah_attr, path->dci_cfi_prio_sl & 0xf);
Patches currently in stable-queue which might be from leonro(a)mellanox.com are
queue-4.14/rdma-mlx5-fix-out-of-bound-access-while-querying-ah.patch
This is a note to let you know that I've just added the patch titled
iser-target: Fix possible use-after-free in connection establishment error
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
iser-target-fix-possible-use-after-free-in-connection-establishment-error.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From cd52cb26e7ead5093635e98e07e221e4df482d34 Mon Sep 17 00:00:00 2001
From: Sagi Grimberg <sagi(a)grimberg.me>
Date: Sun, 26 Nov 2017 15:31:04 +0200
Subject: iser-target: Fix possible use-after-free in connection establishment error
From: Sagi Grimberg <sagi(a)grimberg.me>
commit cd52cb26e7ead5093635e98e07e221e4df482d34 upstream.
In case we fail to establish the connection we must drain our pre-posted
login receive work request before continuing safely with connection
teardown.
Fixes: a060b5629ab0 ("IB/core: generic RDMA READ/WRITE API")
Reported-by: Amrani, Ram <Ram.Amrani(a)cavium.com>
Signed-off-by: Sagi Grimberg <sagi(a)grimberg.me>
Signed-off-by: Doug Ledford <dledford(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/ulp/isert/ib_isert.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -741,6 +741,7 @@ isert_connect_error(struct rdma_cm_id *c
{
struct isert_conn *isert_conn = cma_id->qp->qp_context;
+ ib_drain_qp(isert_conn->qp);
list_del_init(&isert_conn->node);
isert_conn->cm_id = NULL;
isert_put_conn(isert_conn);
Patches currently in stable-queue which might be from sagi(a)grimberg.me are
queue-4.14/iser-target-fix-possible-use-after-free-in-connection-establishment-error.patch
This is a note to let you know that I've just added the patch titled
futex: Prevent overflow by strengthen input validation
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
futex-prevent-overflow-by-strengthen-input-validation.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From fbe0e839d1e22d88810f3ee3e2f1479be4c0aa4a Mon Sep 17 00:00:00 2001
From: Li Jinyue <lijinyue(a)huawei.com>
Date: Thu, 14 Dec 2017 17:04:54 +0800
Subject: futex: Prevent overflow by strengthen input validation
From: Li Jinyue <lijinyue(a)huawei.com>
commit fbe0e839d1e22d88810f3ee3e2f1479be4c0aa4a upstream.
UBSAN reports signed integer overflow in kernel/futex.c:
UBSAN: Undefined behaviour in kernel/futex.c:2041:18
signed integer overflow:
0 - -2147483648 cannot be represented in type 'int'
Add a sanity check to catch negative values of nr_wake and nr_requeue.
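The reported arithmetic is easy to reproduce in plain userspace, independent of the futex code; a minimal sketch of the overflow class (build with, e.g., gcc -fsanitize=undefined to get the same kind of report):

/* Illustrative only: 0 - INT_MIN does not fit in an int, which is the
 * signed overflow UBSAN flagged when a negative count reached the
 * requeue arithmetic.
 */
#include <limits.h>
#include <stdio.h>

int main(void)
{
    volatile int nr = INT_MIN;  /* stand-in for a negative nr_wake/nr_requeue */
    volatile int base = 0;

    int diff = base - nr;       /* UBSAN: signed integer overflow */

    printf("%d\n", diff);
    return 0;
}

The fix avoids ever reaching such arithmetic by rejecting negative nr_wake and nr_requeue up front.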
Signed-off-by: Li Jinyue <lijinyue(a)huawei.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: peterz(a)infradead.org
Cc: dvhart(a)infradead.org
Link: https://lkml.kernel.org/r/1513242294-31786-1-git-send-email-lijinyue@huawei…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/futex.c | 3 +++
1 file changed, 3 insertions(+)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1878,6 +1878,9 @@ static int futex_requeue(u32 __user *uad
struct futex_q *this, *next;
DEFINE_WAKE_Q(wake_q);
+ if (nr_wake < 0 || nr_requeue < 0)
+ return -EINVAL;
+
/*
* When PI not supported: return -ENOSYS if requeue_pi is true,
* consequently the compiler knows requeue_pi is always false past
Patches currently in stable-queue which might be from lijinyue(a)huawei.com are
queue-4.14/futex-prevent-overflow-by-strengthen-input-validation.patch
This is a note to let you know that I've just added the patch titled
IB/hfi1: Prevent a NULL dereference
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ib-hfi1-prevent-a-null-dereference.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 57194fa763bfa1a0908f30d4c77835beaa118fcb Mon Sep 17 00:00:00 2001
From: Dan Carpenter <dan.carpenter(a)oracle.com>
Date: Tue, 9 Jan 2018 23:03:46 +0300
Subject: IB/hfi1: Prevent a NULL dereference
From: Dan Carpenter <dan.carpenter(a)oracle.com>
commit 57194fa763bfa1a0908f30d4c77835beaa118fcb upstream.
In the original code, we set "fd->uctxt" to NULL and then dereference it,
which will cause an Oops.
Fixes: f2a3bc00a03c ("IB/hfi1: Protect context array set/clear with spinlock")
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
Signed-off-by: Doug Ledford <dledford(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/hw/hfi1/file_ops.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -881,11 +881,11 @@ static int complete_subctxt(struct hfi1_
}
if (ret) {
- hfi1_rcd_put(fd->uctxt);
- fd->uctxt = NULL;
spin_lock_irqsave(&fd->dd->uctxt_lock, flags);
__clear_bit(fd->subctxt, fd->uctxt->in_use_ctxts);
spin_unlock_irqrestore(&fd->dd->uctxt_lock, flags);
+ hfi1_rcd_put(fd->uctxt);
+ fd->uctxt = NULL;
}
return ret;
Patches currently in stable-queue which might be from dan.carpenter(a)oracle.com are
queue-4.14/ib-hfi1-prevent-a-null-dereference.patch
This is a note to let you know that I've just added the patch titled
futex: Avoid violating the 10th rule of futex
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
futex-avoid-violating-the-10th-rule-of-futex.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From c1e2f0eaf015fb7076d51a339011f2383e6dd389 Mon Sep 17 00:00:00 2001
From: Peter Zijlstra <peterz(a)infradead.org>
Date: Fri, 8 Dec 2017 13:49:39 +0100
Subject: futex: Avoid violating the 10th rule of futex
From: Peter Zijlstra <peterz(a)infradead.org>
commit c1e2f0eaf015fb7076d51a339011f2383e6dd389 upstream.
Julia reported futex state corruption in the following scenario:
waiter                              waker                               stealer (prio > waiter)

futex(WAIT_REQUEUE_PI, uaddr, uaddr2,
      timeout=[N ms])
   futex_wait_requeue_pi()
      futex_wait_queue_me()
         freezable_schedule()
         <scheduled out>

                                    futex(LOCK_PI, uaddr2)
                                    futex(CMP_REQUEUE_PI, uaddr,
                                          uaddr2, 1, 0)
                                       /* requeues waiter to uaddr2 */
                                    futex(UNLOCK_PI, uaddr2)
                                       wake_futex_pi()
                                          cmp_futex_value_locked(uaddr2, waiter)
                                          wake_up_q()
          <woken by waker>
          <hrtimer_wakeup() fires,
           clears sleeper->task>

                                                                        futex(LOCK_PI, uaddr2)
                                                                           __rt_mutex_start_proxy_lock()
                                                                              try_to_take_rt_mutex() /* steals lock */
                                                                                 rt_mutex_set_owner(lock, stealer)
                                                                           <preempted>

         <scheduled in>
         rt_mutex_wait_proxy_lock()
            __rt_mutex_slowlock()
               try_to_take_rt_mutex() /* fails, lock held by stealer */
               if (timeout && !timeout->task)
                  return -ETIMEDOUT;
         fixup_owner()
            /* lock wasn't acquired, so,
               fixup_pi_state_owner skipped */

return -ETIMEDOUT;

/* At this point, we've returned -ETIMEDOUT to userspace, but the
 * futex word shows waiter to be the owner, and the pi_mutex has
 * stealer as the owner */

futex_lock(LOCK_PI, uaddr2)
   -> bails with EDEADLK, futex word says we're owner.
And suggested that what commit:
73d786bd043e ("futex: Rework inconsistent rt_mutex/futex_q state")
removes from fixup_owner() looks to be just what is needed. And indeed
it is -- I completely missed that requeue_pi could also result in this
case. So we need to restore that, except that subsequent patches, like
commit:
16ffa12d7425 ("futex: Pull rt_mutex_futex_unlock() out from under hb->lock")
changed all the locking rules. Even without that, the sequence:
-       if (rt_mutex_futex_trylock(&q->pi_state->pi_mutex)) {
-               locked = 1;
-               goto out;
-       }
-       raw_spin_lock_irq(&q->pi_state->pi_mutex.wait_lock);
-       owner = rt_mutex_owner(&q->pi_state->pi_mutex);
-       if (!owner)
-               owner = rt_mutex_next_owner(&q->pi_state->pi_mutex);
-       raw_spin_unlock_irq(&q->pi_state->pi_mutex.wait_lock);
-       ret = fixup_pi_state_owner(uaddr, q, owner);
already suggests there were races; otherwise we'd never have to look
at next_owner.
So instead of doing 3 consecutive wait_lock sections with who knows
what races, we do it all in a single section. Additionally, the usage
of pi_state->owner in fixup_owner() was only safe because only the
rt_mutex owner would modify it, which this additional case wrecks.
Luckily the values can only change away from, and not to, the value we're
testing; this means we can do a speculative test and double check once
we have the wait_lock.
Fixes: 73d786bd043e ("futex: Rework inconsistent rt_mutex/futex_q state")
Reported-by: Julia Cartwright <julia(a)ni.com>
Reported-by: Gratian Crisan <gratian.crisan(a)ni.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Julia Cartwright <julia(a)ni.com>
Tested-by: Gratian Crisan <gratian.crisan(a)ni.com>
Cc: Darren Hart <dvhart(a)infradead.org>
Link: https://lkml.kernel.org/r/20171208124939.7livp7no2ov65rrc@hirez.programming…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/futex.c | 83 ++++++++++++++++++++++++++++++++--------
kernel/locking/rtmutex.c | 26 +++++++++---
kernel/locking/rtmutex_common.h | 1
3 files changed, 87 insertions(+), 23 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2294,21 +2294,17 @@ static void unqueue_me_pi(struct futex_q
spin_unlock(q->lock_ptr);
}
-/*
- * Fixup the pi_state owner with the new owner.
- *
- * Must be called with hash bucket lock held and mm->sem held for non
- * private futexes.
- */
static int fixup_pi_state_owner(u32 __user *uaddr, struct futex_q *q,
- struct task_struct *newowner)
+ struct task_struct *argowner)
{
- u32 newtid = task_pid_vnr(newowner) | FUTEX_WAITERS;
struct futex_pi_state *pi_state = q->pi_state;
u32 uval, uninitialized_var(curval), newval;
- struct task_struct *oldowner;
+ struct task_struct *oldowner, *newowner;
+ u32 newtid;
int ret;
+ lockdep_assert_held(q->lock_ptr);
+
raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
oldowner = pi_state->owner;
@@ -2317,11 +2313,17 @@ static int fixup_pi_state_owner(u32 __us
newtid |= FUTEX_OWNER_DIED;
/*
- * We are here either because we stole the rtmutex from the
- * previous highest priority waiter or we are the highest priority
- * waiter but have failed to get the rtmutex the first time.
+ * We are here because either:
+ *
+ * - we stole the lock and pi_state->owner needs updating to reflect
+ * that (@argowner == current),
+ *
+ * or:
*
- * We have to replace the newowner TID in the user space variable.
+ * - someone stole our lock and we need to fix things to point to the
+ * new owner (@argowner == NULL).
+ *
+ * Either way, we have to replace the TID in the user space variable.
* This must be atomic as we have to preserve the owner died bit here.
*
* Note: We write the user space value _before_ changing the pi_state
@@ -2334,6 +2336,42 @@ static int fixup_pi_state_owner(u32 __us
* in the PID check in lookup_pi_state.
*/
retry:
+ if (!argowner) {
+ if (oldowner != current) {
+ /*
+ * We raced against a concurrent self; things are
+ * already fixed up. Nothing to do.
+ */
+ ret = 0;
+ goto out_unlock;
+ }
+
+ if (__rt_mutex_futex_trylock(&pi_state->pi_mutex)) {
+ /* We got the lock after all, nothing to fix. */
+ ret = 0;
+ goto out_unlock;
+ }
+
+ /*
+ * Since we just failed the trylock; there must be an owner.
+ */
+ newowner = rt_mutex_owner(&pi_state->pi_mutex);
+ BUG_ON(!newowner);
+ } else {
+ WARN_ON_ONCE(argowner != current);
+ if (oldowner == current) {
+ /*
+ * We raced against a concurrent self; things are
+ * already fixed up. Nothing to do.
+ */
+ ret = 0;
+ goto out_unlock;
+ }
+ newowner = argowner;
+ }
+
+ newtid = task_pid_vnr(newowner) | FUTEX_WAITERS;
+
if (get_futex_value_locked(&uval, uaddr))
goto handle_fault;
@@ -2434,15 +2472,28 @@ static int fixup_owner(u32 __user *uaddr
* Got the lock. We might not be the anticipated owner if we
* did a lock-steal - fix up the PI-state in that case:
*
- * We can safely read pi_state->owner without holding wait_lock
- * because we now own the rt_mutex, only the owner will attempt
- * to change it.
+ * Speculative pi_state->owner read (we don't hold wait_lock);
+ * since we own the lock pi_state->owner == current is the
+ * stable state, anything else needs more attention.
*/
if (q->pi_state->owner != current)
ret = fixup_pi_state_owner(uaddr, q, current);
goto out;
}
+ /*
+ * If we didn't get the lock; check if anybody stole it from us. In
+ * that case, we need to fix up the uval to point to them instead of
+ * us, otherwise bad things happen. [10]
+ *
+ * Another speculative read; pi_state->owner == current is unstable
+ * but needs our attention.
+ */
+ if (q->pi_state->owner == current) {
+ ret = fixup_pi_state_owner(uaddr, q, NULL);
+ goto out;
+ }
+
/*
* Paranoia check. If we did not take the lock, then we should not be
* the owner of the rt_mutex.
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1290,6 +1290,19 @@ rt_mutex_slowlock(struct rt_mutex *lock,
return ret;
}
+static inline int __rt_mutex_slowtrylock(struct rt_mutex *lock)
+{
+ int ret = try_to_take_rt_mutex(lock, current, NULL);
+
+ /*
+ * try_to_take_rt_mutex() sets the lock waiters bit
+ * unconditionally. Clean this up.
+ */
+ fixup_rt_mutex_waiters(lock);
+
+ return ret;
+}
+
/*
* Slow path try-lock function:
*/
@@ -1312,13 +1325,7 @@ static inline int rt_mutex_slowtrylock(s
*/
raw_spin_lock_irqsave(&lock->wait_lock, flags);
- ret = try_to_take_rt_mutex(lock, current, NULL);
-
- /*
- * try_to_take_rt_mutex() sets the lock waiters bit
- * unconditionally. Clean this up.
- */
- fixup_rt_mutex_waiters(lock);
+ ret = __rt_mutex_slowtrylock(lock);
raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
@@ -1505,6 +1512,11 @@ int __sched rt_mutex_futex_trylock(struc
return rt_mutex_slowtrylock(lock);
}
+int __sched __rt_mutex_futex_trylock(struct rt_mutex *lock)
+{
+ return __rt_mutex_slowtrylock(lock);
+}
+
/**
* rt_mutex_timed_lock - lock a rt_mutex interruptible
* the timeout structure is provided
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -148,6 +148,7 @@ extern bool rt_mutex_cleanup_proxy_lock(
struct rt_mutex_waiter *waiter);
extern int rt_mutex_futex_trylock(struct rt_mutex *l);
+extern int __rt_mutex_futex_trylock(struct rt_mutex *l);
extern void rt_mutex_futex_unlock(struct rt_mutex *lock);
extern bool __rt_mutex_futex_unlock(struct rt_mutex *lock,
Patches currently in stable-queue which might be from peterz(a)infradead.org are
queue-4.14/futex-prevent-overflow-by-strengthen-input-validation.patch
queue-4.14/objtool-fix-clang-enum-conversion-warning.patch
queue-4.14/timers-unconditionally-check-deferrable-base.patch
queue-4.14/futex-avoid-violating-the-10th-rule-of-futex.patch
queue-4.14/delayacct-account-blkio-completion-on-the-correct-task.patch
queue-4.14/objtool-fix-seg-fault-with-clang-compiled-objects.patch
queue-4.14/objtool-fix-seg-fault-caused-by-missing-parameter.patch
This is a note to let you know that I've just added the patch titled
delayacct: Account blkio completion on the correct task
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
delayacct-account-blkio-completion-on-the-correct-task.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From c96f5471ce7d2aefd0dda560cc23f08ab00bc65d Mon Sep 17 00:00:00 2001
From: Josh Snyder <joshs(a)netflix.com>
Date: Mon, 18 Dec 2017 16:15:10 +0000
Subject: delayacct: Account blkio completion on the correct task
From: Josh Snyder <joshs(a)netflix.com>
commit c96f5471ce7d2aefd0dda560cc23f08ab00bc65d upstream.
Before commit:
e33a9bba85a8 ("sched/core: move IO scheduling accounting from io_schedule_timeout() into scheduler")
delayacct_blkio_end() was called after context-switching into the task which
completed I/O.
This resulted in double counting: the task would account a delay both waiting
for I/O and for time spent in the runqueue.
With e33a9bba85a8, delayacct_blkio_end() is called by try_to_wake_up().
In ttwu, we have not yet context-switched. This is more correct, in that
the delay accounting ends when the I/O is complete.
But delayacct_blkio_end() relies on 'get_current()', and we have not yet
context-switched into the task whose I/O completed. This results in the
wrong task having its delay accounting statistics updated.
Instead of doing that, pass the task_struct being woken to delayacct_blkio_end(),
so that it can update the statistics of the correct task.
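The shape of the change is general: a wakeup path runs in the waker's context, so accounting helpers must take the woken task explicitly instead of reaching for the current task. A tiny userspace analogue (made-up names, not the kernel API):

/* Purely illustrative analogue: the waker runs in its own context, so
 * accounting helpers must take the woken task explicitly rather than
 * charging whoever happens to be "current".
 */
#include <stdio.h>
#include <stdint.h>

struct task_stats {
    const char *name;
    uint64_t blkio_start_ns;
    uint64_t blkio_delay_ns;
};

/* Stand-in for the kernel's current/get_current(). */
static struct task_stats *current_task;

/* Old shape: implicitly charges whoever is running right now. */
static void blkio_end_old(uint64_t now_ns)
{
    current_task->blkio_delay_ns += now_ns - current_task->blkio_start_ns;
}

/* New shape (what the patch does): charge the task being woken. */
static void blkio_end_new(struct task_stats *woken, uint64_t now_ns)
{
    woken->blkio_delay_ns += now_ns - woken->blkio_start_ns;
}

int main(void)
{
    struct task_stats waker   = { "waker",   0, 0 };
    struct task_stats sleeper = { "sleeper", 100, 0 };

    /* The wakeup runs in the waker's context, before any context switch. */
    current_task = &waker;

    blkio_end_old(250);            /* wrongly charges the waker           */
    blkio_end_new(&sleeper, 250);  /* correctly charges the woken sleeper */

    printf("waker=%llu sleeper=%llu\n",
           (unsigned long long)waker.blkio_delay_ns,
           (unsigned long long)sleeper.blkio_delay_ns);
    return 0;
}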
Signed-off-by: Josh Snyder <joshs(a)netflix.com>
Acked-by: Tejun Heo <tj(a)kernel.org>
Acked-by: Balbir Singh <bsingharora(a)gmail.com>
Cc: Brendan Gregg <bgregg(a)netflix.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-block(a)vger.kernel.org
Fixes: e33a9bba85a8 ("sched/core: move IO scheduling accounting from io_schedule_timeout() into scheduler")
Link: http://lkml.kernel.org/r/1513613712-571-1-git-send-email-joshs@netflix.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/delayacct.h | 8 ++++----
kernel/delayacct.c | 42 ++++++++++++++++++++++++++----------------
kernel/sched/core.c | 6 +++---
3 files changed, 33 insertions(+), 23 deletions(-)
--- a/include/linux/delayacct.h
+++ b/include/linux/delayacct.h
@@ -71,7 +71,7 @@ extern void delayacct_init(void);
extern void __delayacct_tsk_init(struct task_struct *);
extern void __delayacct_tsk_exit(struct task_struct *);
extern void __delayacct_blkio_start(void);
-extern void __delayacct_blkio_end(void);
+extern void __delayacct_blkio_end(struct task_struct *);
extern int __delayacct_add_tsk(struct taskstats *, struct task_struct *);
extern __u64 __delayacct_blkio_ticks(struct task_struct *);
extern void __delayacct_freepages_start(void);
@@ -122,10 +122,10 @@ static inline void delayacct_blkio_start
__delayacct_blkio_start();
}
-static inline void delayacct_blkio_end(void)
+static inline void delayacct_blkio_end(struct task_struct *p)
{
if (current->delays)
- __delayacct_blkio_end();
+ __delayacct_blkio_end(p);
delayacct_clear_flag(DELAYACCT_PF_BLKIO);
}
@@ -169,7 +169,7 @@ static inline void delayacct_tsk_free(st
{}
static inline void delayacct_blkio_start(void)
{}
-static inline void delayacct_blkio_end(void)
+static inline void delayacct_blkio_end(struct task_struct *p)
{}
static inline int delayacct_add_tsk(struct taskstats *d,
struct task_struct *tsk)
--- a/kernel/delayacct.c
+++ b/kernel/delayacct.c
@@ -51,16 +51,16 @@ void __delayacct_tsk_init(struct task_st
* Finish delay accounting for a statistic using its timestamps (@start),
* accumalator (@total) and @count
*/
-static void delayacct_end(u64 *start, u64 *total, u32 *count)
+static void delayacct_end(spinlock_t *lock, u64 *start, u64 *total, u32 *count)
{
s64 ns = ktime_get_ns() - *start;
unsigned long flags;
if (ns > 0) {
- spin_lock_irqsave(¤t->delays->lock, flags);
+ spin_lock_irqsave(lock, flags);
*total += ns;
(*count)++;
- spin_unlock_irqrestore(¤t->delays->lock, flags);
+ spin_unlock_irqrestore(lock, flags);
}
}
@@ -69,17 +69,25 @@ void __delayacct_blkio_start(void)
current->delays->blkio_start = ktime_get_ns();
}
-void __delayacct_blkio_end(void)
+/*
+ * We cannot rely on the `current` macro, as we haven't yet switched back to
+ * the process being woken.
+ */
+void __delayacct_blkio_end(struct task_struct *p)
{
- if (current->delays->flags & DELAYACCT_PF_SWAPIN)
- /* Swapin block I/O */
- delayacct_end(¤t->delays->blkio_start,
- ¤t->delays->swapin_delay,
- ¤t->delays->swapin_count);
- else /* Other block I/O */
- delayacct_end(¤t->delays->blkio_start,
- ¤t->delays->blkio_delay,
- ¤t->delays->blkio_count);
+ struct task_delay_info *delays = p->delays;
+ u64 *total;
+ u32 *count;
+
+ if (p->delays->flags & DELAYACCT_PF_SWAPIN) {
+ total = &delays->swapin_delay;
+ count = &delays->swapin_count;
+ } else {
+ total = &delays->blkio_delay;
+ count = &delays->blkio_count;
+ }
+
+ delayacct_end(&delays->lock, &delays->blkio_start, total, count);
}
int __delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
@@ -153,8 +161,10 @@ void __delayacct_freepages_start(void)
void __delayacct_freepages_end(void)
{
- delayacct_end(¤t->delays->freepages_start,
- ¤t->delays->freepages_delay,
- ¤t->delays->freepages_count);
+ delayacct_end(
+ ¤t->delays->lock,
+ ¤t->delays->freepages_start,
+ ¤t->delays->freepages_delay,
+ ¤t->delays->freepages_count);
}
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2046,7 +2046,7 @@ try_to_wake_up(struct task_struct *p, un
p->state = TASK_WAKING;
if (p->in_iowait) {
- delayacct_blkio_end();
+ delayacct_blkio_end(p);
atomic_dec(&task_rq(p)->nr_iowait);
}
@@ -2059,7 +2059,7 @@ try_to_wake_up(struct task_struct *p, un
#else /* CONFIG_SMP */
if (p->in_iowait) {
- delayacct_blkio_end();
+ delayacct_blkio_end(p);
atomic_dec(&task_rq(p)->nr_iowait);
}
@@ -2112,7 +2112,7 @@ static void try_to_wake_up_local(struct
if (!task_on_rq_queued(p)) {
if (p->in_iowait) {
- delayacct_blkio_end();
+ delayacct_blkio_end(p);
atomic_dec(&rq->nr_iowait);
}
ttwu_activate(rq, p, ENQUEUE_WAKEUP | ENQUEUE_NOCLOCK);
Patches currently in stable-queue which might be from joshs(a)netflix.com are
queue-4.14/delayacct-account-blkio-completion-on-the-correct-task.patch
This is a note to let you know that I've just added the patch titled
ALSA: seq: Make ioctls race-free
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
alsa-seq-make-ioctls-race-free.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b3defb791b26ea0683a93a4f49c77ec45ec96f10 Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai(a)suse.de>
Date: Tue, 9 Jan 2018 23:11:03 +0100
Subject: ALSA: seq: Make ioctls race-free
From: Takashi Iwai <tiwai(a)suse.de>
commit b3defb791b26ea0683a93a4f49c77ec45ec96f10 upstream.
The ALSA sequencer ioctls have no protection against racy calls, so
concurrent operations may interfere with each other. As reported
recently, for example, concurrent calls that set the client pool,
combined with write calls, may lead to either an unkillable deadlock
or a use-after-free (UAF).
As a slightly big-hammer solution, this patch introduces a mutex to
make each ioctl exclusive. Although this may reduce performance for
parallel ioctl calls, that is usually not demanded for sequencer
usage, so the impact should be negligible.
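The fix follows the classic per-object serialization pattern; a rough userspace analogue (pthread-based, made-up names, not the ALSA code):

/* Illustrative only: one mutex per client object, taken around every
 * command handler so handlers never run concurrently for the same client.
 */
#include <pthread.h>
#include <stdio.h>

struct client {
    int pool_size;
    pthread_mutex_t ioctl_mutex;   /* counterpart of client->ioctl_mutex */
};

static int handle_set_pool(struct client *c, int new_size)
{
    /* With the mutex held by the caller, handlers run one at a time per
     * client, so no concurrent caller can observe the pool mid-resize. */
    c->pool_size = new_size;
    return 0;
}

static int client_ioctl(struct client *c, int new_size)
{
    pthread_mutex_lock(&c->ioctl_mutex);
    int err = handle_set_pool(c, new_size);
    pthread_mutex_unlock(&c->ioctl_mutex);
    return err;
}

int main(void)
{
    struct client c = { .pool_size = 16,
                        .ioctl_mutex = PTHREAD_MUTEX_INITIALIZER };
    printf("err=%d pool=%d\n", client_ioctl(&c, 64), c.pool_size);
    return 0;
}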
Reported-by: Luo Quan <a4651386(a)163.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/core/seq/seq_clientmgr.c | 3 +++
sound/core/seq/seq_clientmgr.h | 1 +
2 files changed, 4 insertions(+)
--- a/sound/core/seq/seq_clientmgr.c
+++ b/sound/core/seq/seq_clientmgr.c
@@ -221,6 +221,7 @@ static struct snd_seq_client *seq_create
rwlock_init(&client->ports_lock);
mutex_init(&client->ports_mutex);
INIT_LIST_HEAD(&client->ports_list_head);
+ mutex_init(&client->ioctl_mutex);
/* find free slot in the client table */
spin_lock_irqsave(&clients_lock, flags);
@@ -2126,7 +2127,9 @@ static long snd_seq_ioctl(struct file *f
return -EFAULT;
}
+ mutex_lock(&client->ioctl_mutex);
err = handler->func(client, &buf);
+ mutex_unlock(&client->ioctl_mutex);
if (err >= 0) {
/* Some commands includes a bug in 'dir' field. */
if (handler->cmd == SNDRV_SEQ_IOCTL_SET_QUEUE_CLIENT ||
--- a/sound/core/seq/seq_clientmgr.h
+++ b/sound/core/seq/seq_clientmgr.h
@@ -61,6 +61,7 @@ struct snd_seq_client {
struct list_head ports_list_head;
rwlock_t ports_lock;
struct mutex ports_mutex;
+ struct mutex ioctl_mutex;
int convert32; /* convert 32->64bit */
/* output pool */
Patches currently in stable-queue which might be from tiwai(a)suse.de are
queue-4.14/alsa-pcm-remove-yet-superfluous-warn_on.patch
queue-4.14/alsa-hda-apply-the-existing-quirk-to-imac-14-1.patch
queue-4.14/alsa-hda-apply-headphone-noise-quirk-for-another-dell-xps-13-variant.patch
queue-4.14/alsa-seq-make-ioctls-race-free.patch
This is a note to let you know that I've just added the patch titled
ALSA: pcm: Remove yet superfluous WARN_ON()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
alsa-pcm-remove-yet-superfluous-warn_on.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 23b19b7b50fe1867da8d431eea9cd3e4b6328c2c Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai(a)suse.de>
Date: Wed, 10 Jan 2018 23:48:05 +0100
Subject: ALSA: pcm: Remove yet superfluous WARN_ON()
From: Takashi Iwai <tiwai(a)suse.de>
commit 23b19b7b50fe1867da8d431eea9cd3e4b6328c2c upstream.
muldiv32() contains a snd_BUG_ON() (which turns into a WARN_ON() when the
debug option is enabled) for checking the case of 0 / 0. This would be
helpful if the condition could only arise from a logical error; however,
since the hw refine is performed with whatever data set the user provides,
inconsistent values that trigger this condition can easily be passed in.
Actually, syzbot caught this by passing a zeroed-out hw_params via the old
ioctl.
So, having snd_BUG_ON() there is simply superfluous and rather harmful,
as it only causes unnecessary confusion. Let's get rid of it.
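For reference, a loose userspace sketch of the helper's shape (an approximation, not a verbatim copy of the kernel function) showing that the c == 0 case is already handled gracefully, so the warning added nothing but noise for user-triggerable 0 / 0 input:

/* Loose userspace sketch of the corner case; approximates the kernel
 * helper only closely enough to show the c == 0 path.
 */
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

static unsigned int muldiv32_sketch(unsigned int a, unsigned int b,
                                    unsigned int c, unsigned int *r)
{
    uint64_t n = (uint64_t)a * b;

    if (c == 0) {
        /* The removed snd_BUG_ON(!n) warned exactly here when a or b was
         * also 0 -- an input user space can feed in via hw_params. */
        *r = 0;
        return UINT_MAX;
    }
    *r = n % c;
    n /= c;
    return n > UINT_MAX ? UINT_MAX : (unsigned int)n;
}

int main(void)
{
    unsigned int rem;

    /* 0 * 0 / 0: handled gracefully, no reason to WARN. */
    printf("%u rem %u\n", muldiv32_sketch(0, 0, 0, &rem), rem);
    return 0;
}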
Reported-by: syzbot+7e6ee55011deeebce15d(a)syzkaller.appspotmail.com
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/core/pcm_lib.c | 1 -
1 file changed, 1 deletion(-)
--- a/sound/core/pcm_lib.c
+++ b/sound/core/pcm_lib.c
@@ -560,7 +560,6 @@ static inline unsigned int muldiv32(unsi
{
u_int64_t n = (u_int64_t) a * b;
if (c == 0) {
- snd_BUG_ON(!n);
*r = 0;
return UINT_MAX;
}
Patches currently in stable-queue which might be from tiwai(a)suse.de are
queue-4.14/alsa-pcm-remove-yet-superfluous-warn_on.patch
queue-4.14/alsa-hda-apply-the-existing-quirk-to-imac-14-1.patch
queue-4.14/alsa-hda-apply-headphone-noise-quirk-for-another-dell-xps-13-variant.patch
queue-4.14/alsa-seq-make-ioctls-race-free.patch
This is a note to let you know that I've just added the patch titled
ALSA: hda - Apply the existing quirk to iMac 14,1
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
alsa-hda-apply-the-existing-quirk-to-imac-14-1.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 031f335cda879450095873003abb03ae8ed3b74a Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai(a)suse.de>
Date: Wed, 10 Jan 2018 10:53:18 +0100
Subject: ALSA: hda - Apply the existing quirk to iMac 14,1
From: Takashi Iwai <tiwai(a)suse.de>
commit 031f335cda879450095873003abb03ae8ed3b74a upstream.
iMac 14,1 requires the same quirk as iMac 12,2, using GPIO 2 and 3 for
headphone and speaker output amps. Add the codec SSID quirk entry
(106b:0600) accordingly.
BugLink: http://lkml.kernel.org/r/CAEw6Zyteav09VGHRfD5QwsfuWv5a43r0tFBNbfcHXoNrxVz7e…
Reported-by: Freaky <freaky2000(a)gmail.com>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/pci/hda/patch_cirrus.c | 1 +
1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_cirrus.c
+++ b/sound/pci/hda/patch_cirrus.c
@@ -408,6 +408,7 @@ static const struct snd_pci_quirk cs420x
/*SND_PCI_QUIRK(0x8086, 0x7270, "IMac 27 Inch", CS420X_IMAC27),*/
/* codec SSID */
+ SND_PCI_QUIRK(0x106b, 0x0600, "iMac 14,1", CS420X_IMAC27_122),
SND_PCI_QUIRK(0x106b, 0x1c00, "MacBookPro 8,1", CS420X_MBP81),
SND_PCI_QUIRK(0x106b, 0x2000, "iMac 12,2", CS420X_IMAC27_122),
SND_PCI_QUIRK(0x106b, 0x2800, "MacBookPro 10,1", CS420X_MBP101),
Patches currently in stable-queue which might be from tiwai(a)suse.de are
queue-4.14/alsa-pcm-remove-yet-superfluous-warn_on.patch
queue-4.14/alsa-hda-apply-the-existing-quirk-to-imac-14-1.patch
queue-4.14/alsa-hda-apply-headphone-noise-quirk-for-another-dell-xps-13-variant.patch
queue-4.14/alsa-seq-make-ioctls-race-free.patch
This is a note to let you know that I've just added the patch titled
ALSA: hda - Apply headphone noise quirk for another Dell XPS 13 variant
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
alsa-hda-apply-headphone-noise-quirk-for-another-dell-xps-13-variant.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From e4c9fd10eb21376f44723c40ad12395089251c28 Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai(a)suse.de>
Date: Wed, 10 Jan 2018 08:34:28 +0100
Subject: ALSA: hda - Apply headphone noise quirk for another Dell XPS 13 variant
From: Takashi Iwai <tiwai(a)suse.de>
commit e4c9fd10eb21376f44723c40ad12395089251c28 upstream.
There is another Dell XPS 13 variant (SSID 1028:082a) that requires
the existing fixup for reducing the headphone noise.
This patch adds the quirk entry for that.
BugLink: http://lkml.kernel.org/r/CAHXyb9ZCZJzVisuBARa+UORcjRERV8yokez=DP1_5O5isTz0Z…
Reported-and-tested-by: Francisco G. <frangio.1(a)gmail.com>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/pci/hda/patch_realtek.c | 1 +
1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -6173,6 +6173,7 @@ static const struct snd_pci_quirk alc269
SND_PCI_QUIRK(0x1028, 0x075b, "Dell XPS 13 9360", ALC256_FIXUP_DELL_XPS_13_HEADPHONE_NOISE),
SND_PCI_QUIRK(0x1028, 0x075d, "Dell AIO", ALC298_FIXUP_SPK_VOLUME),
SND_PCI_QUIRK(0x1028, 0x0798, "Dell Inspiron 17 7000 Gaming", ALC256_FIXUP_DELL_INSPIRON_7559_SUBWOOFER),
+ SND_PCI_QUIRK(0x1028, 0x082a, "Dell XPS 13 9360", ALC256_FIXUP_DELL_XPS_13_HEADPHONE_NOISE),
SND_PCI_QUIRK(0x1028, 0x164a, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1028, 0x164b, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x103c, 0x1586, "HP", ALC269_FIXUP_HP_MUTE_LED_MIC2),
Patches currently in stable-queue which might be from tiwai(a)suse.de are
queue-4.14/alsa-pcm-remove-yet-superfluous-warn_on.patch
queue-4.14/alsa-hda-apply-the-existing-quirk-to-imac-14-1.patch
queue-4.14/alsa-hda-apply-headphone-noise-quirk-for-another-dell-xps-13-variant.patch
queue-4.14/alsa-seq-make-ioctls-race-free.patch
This is a note to let you know that I've just added the patch titled
af_key: fix buffer overread in verify_address_len()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
af_key-fix-buffer-overread-in-verify_address_len.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 06b335cb51af018d5feeff5dd4fd53847ddb675a Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Fri, 29 Dec 2017 18:13:05 -0600
Subject: af_key: fix buffer overread in verify_address_len()
From: Eric Biggers <ebiggers(a)google.com>
commit 06b335cb51af018d5feeff5dd4fd53847ddb675a upstream.
If a message sent to a PF_KEY socket ended with one of the extensions
that takes a 'struct sadb_address' but there were not enough bytes
remaining in the message for the ->sa_family member of the 'struct
sockaddr' which is supposed to follow, then verify_address_len() read
past the end of the message, into uninitialized memory. Fix it by
returning -EINVAL in this case.
This bug was found using syzkaller with KMSAN.
Reproducer:
#include <linux/pfkeyv2.h>
#include <sys/socket.h>
#include <unistd.h>

int main()
{
    int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
    char buf[24] = { 0 };
    struct sadb_msg *msg = (void *)buf;
    struct sadb_address *addr = (void *)(msg + 1);

    msg->sadb_msg_version = PF_KEY_V2;
    msg->sadb_msg_type = SADB_DELETE;
    msg->sadb_msg_len = 3;
    addr->sadb_address_len = 1;
    addr->sadb_address_exttype = SADB_EXT_ADDRESS_SRC;

    write(sock, buf, 24);
}
Reported-by: Alexander Potapenko <glider(a)google.com>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/key/af_key.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -401,6 +401,11 @@ static int verify_address_len(const void
#endif
int len;
+ if (sp->sadb_address_len <
+ DIV_ROUND_UP(sizeof(*sp) + offsetofend(typeof(*addr), sa_family),
+ sizeof(uint64_t)))
+ return -EINVAL;
+
switch (addr->sa_family) {
case AF_INET:
len = DIV_ROUND_UP(sizeof(*sp) + sizeof(*sin), sizeof(uint64_t));
Patches currently in stable-queue which might be from ebiggers(a)google.com are
queue-4.14/af_key-fix-buffer-overread-in-parse_exthdrs.patch
queue-4.14/af_key-fix-buffer-overread-in-verify_address_len.patch
This is a note to let you know that I've just added the patch titled
af_key: fix buffer overread in parse_exthdrs()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
af_key-fix-buffer-overread-in-parse_exthdrs.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 4e765b4972af7b07adcb1feb16e7a525ce1f6b28 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Fri, 29 Dec 2017 18:15:23 -0600
Subject: af_key: fix buffer overread in parse_exthdrs()
From: Eric Biggers <ebiggers(a)google.com>
commit 4e765b4972af7b07adcb1feb16e7a525ce1f6b28 upstream.
If a message sent to a PF_KEY socket ended with an incomplete extension
header (fewer than 4 bytes remaining), then parse_exthdrs() read past
the end of the message, into uninitialized memory. Fix it by returning
-EINVAL in this case.
Reproducer:
#include <linux/pfkeyv2.h>
#include <sys/socket.h>
#include <unistd.h>

int main()
{
    int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
    char buf[17] = { 0 };
    struct sadb_msg *msg = (void *)buf;

    msg->sadb_msg_version = PF_KEY_V2;
    msg->sadb_msg_type = SADB_DELETE;
    msg->sadb_msg_len = 2;

    write(sock, buf, 17);
}
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/key/af_key.c | 3 +++
1 file changed, 3 insertions(+)
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -516,6 +516,9 @@ static int parse_exthdrs(struct sk_buff
uint16_t ext_type;
int ext_len;
+ if (len < sizeof(*ehdr))
+ return -EINVAL;
+
ext_len = ehdr->sadb_ext_len;
ext_len *= sizeof(uint64_t);
ext_type = ehdr->sadb_ext_type;
Patches currently in stable-queue which might be from ebiggers(a)google.com are
queue-4.14/af_key-fix-buffer-overread-in-parse_exthdrs.patch
queue-4.14/af_key-fix-buffer-overread-in-verify_address_len.patch
This is a note to let you know that I've just added the patch titled
futex: Prevent overflow by strengthen input validation
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
futex-prevent-overflow-by-strengthen-input-validation.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From fbe0e839d1e22d88810f3ee3e2f1479be4c0aa4a Mon Sep 17 00:00:00 2001
From: Li Jinyue <lijinyue(a)huawei.com>
Date: Thu, 14 Dec 2017 17:04:54 +0800
Subject: futex: Prevent overflow by strengthen input validation
From: Li Jinyue <lijinyue(a)huawei.com>
commit fbe0e839d1e22d88810f3ee3e2f1479be4c0aa4a upstream.
UBSAN reports signed integer overflow in kernel/futex.c:
UBSAN: Undefined behaviour in kernel/futex.c:2041:18
signed integer overflow:
0 - -2147483648 cannot be represented in type 'int'
Add a sanity check to catch negative values of nr_wake and nr_requeue.
Signed-off-by: Li Jinyue <lijinyue(a)huawei.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: peterz(a)infradead.org
Cc: dvhart(a)infradead.org
Link: https://lkml.kernel.org/r/1513242294-31786-1-git-send-email-lijinyue@huawei…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/futex.c | 3 +++
1 file changed, 3 insertions(+)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1514,6 +1514,9 @@ static int futex_requeue(u32 __user *uad
struct futex_hash_bucket *hb1, *hb2;
struct futex_q *this, *next;
+ if (nr_wake < 0 || nr_requeue < 0)
+ return -EINVAL;
+
if (requeue_pi) {
/*
* Requeue PI only works on two distinct uaddrs. This
Patches currently in stable-queue which might be from lijinyue(a)huawei.com are
queue-3.18/futex-prevent-overflow-by-strengthen-input-validation.patch
This is a note to let you know that I've just added the patch titled
ALSA: pcm: Remove yet superfluous WARN_ON()
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
alsa-pcm-remove-yet-superfluous-warn_on.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 23b19b7b50fe1867da8d431eea9cd3e4b6328c2c Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai(a)suse.de>
Date: Wed, 10 Jan 2018 23:48:05 +0100
Subject: ALSA: pcm: Remove yet superfluous WARN_ON()
From: Takashi Iwai <tiwai(a)suse.de>
commit 23b19b7b50fe1867da8d431eea9cd3e4b6328c2c upstream.
muldiv32() contains a snd_BUG_ON() (which turns into a WARN_ON() when the
debug option is enabled) for checking the case of 0 / 0. This would be
helpful if the condition could only arise from a logical error; however,
since the hw refine is performed with whatever data set the user provides,
inconsistent values that trigger this condition can easily be passed in.
Actually, syzbot caught this by passing a zeroed-out hw_params via the old
ioctl.
So, having snd_BUG_ON() there is simply superfluous and rather harmful,
as it only causes unnecessary confusion. Let's get rid of it.
Reported-by: syzbot+7e6ee55011deeebce15d(a)syzkaller.appspotmail.com
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/core/pcm_lib.c | 1 -
1 file changed, 1 deletion(-)
--- a/sound/core/pcm_lib.c
+++ b/sound/core/pcm_lib.c
@@ -644,7 +644,6 @@ static inline unsigned int muldiv32(unsi
{
u_int64_t n = (u_int64_t) a * b;
if (c == 0) {
- snd_BUG_ON(!n);
*r = 0;
return UINT_MAX;
}
Patches currently in stable-queue which might be from tiwai(a)suse.de are
queue-3.18/alsa-pcm-remove-yet-superfluous-warn_on.patch
queue-3.18/alsa-hda-apply-the-existing-quirk-to-imac-14-1.patch
This is a note to let you know that I've just added the patch titled
af_key: fix buffer overread in verify_address_len()
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
af_key-fix-buffer-overread-in-verify_address_len.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 06b335cb51af018d5feeff5dd4fd53847ddb675a Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Fri, 29 Dec 2017 18:13:05 -0600
Subject: af_key: fix buffer overread in verify_address_len()
From: Eric Biggers <ebiggers(a)google.com>
commit 06b335cb51af018d5feeff5dd4fd53847ddb675a upstream.
If a message sent to a PF_KEY socket ended with one of the extensions
that takes a 'struct sadb_address' but there were not enough bytes
remaining in the message for the ->sa_family member of the 'struct
sockaddr' which is supposed to follow, then verify_address_len() read
past the end of the message, into uninitialized memory. Fix it by
returning -EINVAL in this case.
This bug was found using syzkaller with KMSAN.
Reproducer:
#include <linux/pfkeyv2.h>
#include <sys/socket.h>
#include <unistd.h>

int main()
{
    int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
    char buf[24] = { 0 };
    struct sadb_msg *msg = (void *)buf;
    struct sadb_address *addr = (void *)(msg + 1);

    msg->sadb_msg_version = PF_KEY_V2;
    msg->sadb_msg_type = SADB_DELETE;
    msg->sadb_msg_len = 3;
    addr->sadb_address_len = 1;
    addr->sadb_address_exttype = SADB_EXT_ADDRESS_SRC;

    write(sock, buf, 24);
}
Reported-by: Alexander Potapenko <glider(a)google.com>
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/key/af_key.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -401,6 +401,11 @@ static int verify_address_len(const void
#endif
int len;
+ if (sp->sadb_address_len <
+ DIV_ROUND_UP(sizeof(*sp) + offsetofend(typeof(*addr), sa_family),
+ sizeof(uint64_t)))
+ return -EINVAL;
+
switch (addr->sa_family) {
case AF_INET:
len = DIV_ROUND_UP(sizeof(*sp) + sizeof(*sin), sizeof(uint64_t));
Patches currently in stable-queue which might be from ebiggers(a)google.com are
queue-3.18/af_key-fix-buffer-overread-in-parse_exthdrs.patch
queue-3.18/af_key-fix-buffer-overread-in-verify_address_len.patch
This is a note to let you know that I've just added the patch titled
ALSA: hda - Apply the existing quirk to iMac 14,1
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
alsa-hda-apply-the-existing-quirk-to-imac-14-1.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 031f335cda879450095873003abb03ae8ed3b74a Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai(a)suse.de>
Date: Wed, 10 Jan 2018 10:53:18 +0100
Subject: ALSA: hda - Apply the existing quirk to iMac 14,1
From: Takashi Iwai <tiwai(a)suse.de>
commit 031f335cda879450095873003abb03ae8ed3b74a upstream.
iMac 14,1 requires the same quirk as iMac 12,2, using GPIO 2 and 3 for
headphone and speaker output amps. Add the codec SSID quirk entry
(106b:0600) accordingly.
BugLink: http://lkml.kernel.org/r/CAEw6Zyteav09VGHRfD5QwsfuWv5a43r0tFBNbfcHXoNrxVz7e…
Reported-by: Freaky <freaky2000(a)gmail.com>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/pci/hda/patch_cirrus.c | 1 +
1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_cirrus.c
+++ b/sound/pci/hda/patch_cirrus.c
@@ -394,6 +394,7 @@ static const struct snd_pci_quirk cs420x
/*SND_PCI_QUIRK(0x8086, 0x7270, "IMac 27 Inch", CS420X_IMAC27),*/
/* codec SSID */
+ SND_PCI_QUIRK(0x106b, 0x0600, "iMac 14,1", CS420X_IMAC27_122),
SND_PCI_QUIRK(0x106b, 0x1c00, "MacBookPro 8,1", CS420X_MBP81),
SND_PCI_QUIRK(0x106b, 0x2000, "iMac 12,2", CS420X_IMAC27_122),
SND_PCI_QUIRK(0x106b, 0x2800, "MacBookPro 10,1", CS420X_MBP101),
Patches currently in stable-queue which might be from tiwai(a)suse.de are
queue-3.18/alsa-pcm-remove-yet-superfluous-warn_on.patch
queue-3.18/alsa-hda-apply-the-existing-quirk-to-imac-14-1.patch
This is a note to let you know that I've just added the patch titled
af_key: fix buffer overread in parse_exthdrs()
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
af_key-fix-buffer-overread-in-parse_exthdrs.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 4e765b4972af7b07adcb1feb16e7a525ce1f6b28 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Fri, 29 Dec 2017 18:15:23 -0600
Subject: af_key: fix buffer overread in parse_exthdrs()
From: Eric Biggers <ebiggers(a)google.com>
commit 4e765b4972af7b07adcb1feb16e7a525ce1f6b28 upstream.
If a message sent to a PF_KEY socket ended with an incomplete extension
header (fewer than 4 bytes remaining), then parse_exthdrs() read past
the end of the message, into uninitialized memory. Fix it by returning
-EINVAL in this case.
Reproducer:
#include <linux/pfkeyv2.h>
#include <sys/socket.h>
#include <unistd.h>
int main()
{
int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
char buf[17] = { 0 };
struct sadb_msg *msg = (void *)buf;
msg->sadb_msg_version = PF_KEY_V2;
msg->sadb_msg_type = SADB_DELETE;
msg->sadb_msg_len = 2;
write(sock, buf, 17);
}
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/key/af_key.c | 3 +++
1 file changed, 3 insertions(+)
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -516,6 +516,9 @@ static int parse_exthdrs(struct sk_buff
uint16_t ext_type;
int ext_len;
+ if (len < sizeof(*ehdr))
+ return -EINVAL;
+
ext_len = ehdr->sadb_ext_len;
ext_len *= sizeof(uint64_t);
ext_type = ehdr->sadb_ext_type;
Patches currently in stable-queue which might be from ebiggers(a)google.com are
queue-3.18/af_key-fix-buffer-overread-in-parse_exthdrs.patch
queue-3.18/af_key-fix-buffer-overread-in-verify_address_len.patch
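For reviewers of the parse_exthdrs() backport, the shape of the hardened extension walk looks roughly like the sketch below. It is a simplification, not the kernel function verbatim; struct sadb_ext is the 4-byte {sadb_ext_len, sadb_ext_type} header from <linux/pfkeyv2.h>, and the trailing-fragment case is exactly what the new check rejects:

	/* 'p' points just past the 16-byte struct sadb_msg, 'len' is the
	 * number of bytes remaining in the message. */
	while (len > 0) {
		const struct sadb_ext *ehdr = (const struct sadb_ext *)p;
		uint16_t ext_type;
		int ext_len;

		if (len < sizeof(*ehdr))	/* the new check: incomplete header */
			return -EINVAL;

		ext_len  = ehdr->sadb_ext_len;
		ext_len *= sizeof(uint64_t);
		ext_type = ehdr->sadb_ext_type;

		/* pre-existing sanity checks, roughly */
		if (ext_len < sizeof(uint64_t) || ext_len > len)
			return -EINVAL;

		/* ... record the extension pointer, then advance ... */
		p   += ext_len;
		len -= ext_len;
	}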
The patch below does not apply to the 3.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b3defb791b26ea0683a93a4f49c77ec45ec96f10 Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai(a)suse.de>
Date: Tue, 9 Jan 2018 23:11:03 +0100
Subject: [PATCH] ALSA: seq: Make ioctls race-free
The ALSA sequencer ioctls have no protection against racy calls, so
concurrent operations may interfere with each other. As reported
recently, for example, concurrent calls setting the client pool,
combined with write calls, may lead to either an unkillable deadlock
or a use-after-free (UAF).
As a slightly big-hammer solution, this patch introduces a mutex to
make each ioctl exclusive. Although this may reduce the performance of
parallel ioctl calls, such parallelism is not usually demanded for
sequencer usage, so the impact should be negligible.
Reported-by: Luo Quan <a4651386(a)163.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
diff --git a/sound/core/seq/seq_clientmgr.c b/sound/core/seq/seq_clientmgr.c
index 6e22eea72654..d01913404581 100644
--- a/sound/core/seq/seq_clientmgr.c
+++ b/sound/core/seq/seq_clientmgr.c
@@ -221,6 +221,7 @@ static struct snd_seq_client *seq_create_client1(int client_index, int poolsize)
rwlock_init(&client->ports_lock);
mutex_init(&client->ports_mutex);
INIT_LIST_HEAD(&client->ports_list_head);
+ mutex_init(&client->ioctl_mutex);
/* find free slot in the client table */
spin_lock_irqsave(&clients_lock, flags);
@@ -2130,7 +2131,9 @@ static long snd_seq_ioctl(struct file *file, unsigned int cmd,
return -EFAULT;
}
+ mutex_lock(&client->ioctl_mutex);
err = handler->func(client, &buf);
+ mutex_unlock(&client->ioctl_mutex);
if (err >= 0) {
/* Some commands includes a bug in 'dir' field. */
if (handler->cmd == SNDRV_SEQ_IOCTL_SET_QUEUE_CLIENT ||
diff --git a/sound/core/seq/seq_clientmgr.h b/sound/core/seq/seq_clientmgr.h
index c6614254ef8a..0611e1e0ed5b 100644
--- a/sound/core/seq/seq_clientmgr.h
+++ b/sound/core/seq/seq_clientmgr.h
@@ -61,6 +61,7 @@ struct snd_seq_client {
struct list_head ports_list_head;
rwlock_t ports_lock;
struct mutex ports_mutex;
+ struct mutex ioctl_mutex;
int convert32; /* convert 32->64bit */
/* output pool */
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b3defb791b26ea0683a93a4f49c77ec45ec96f10 Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai(a)suse.de>
Date: Tue, 9 Jan 2018 23:11:03 +0100
Subject: [PATCH] ALSA: seq: Make ioctls race-free
The ALSA sequencer ioctls have no protection against racy calls, so
concurrent operations may interfere with each other. As reported
recently, for example, concurrent calls setting the client pool,
combined with write calls, may lead to either an unkillable deadlock
or a use-after-free (UAF).
As a slightly big-hammer solution, this patch introduces a mutex to
make each ioctl exclusive. Although this may reduce the performance of
parallel ioctl calls, such parallelism is not usually demanded for
sequencer usage, so the impact should be negligible.
Reported-by: Luo Quan <a4651386(a)163.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
diff --git a/sound/core/seq/seq_clientmgr.c b/sound/core/seq/seq_clientmgr.c
index 6e22eea72654..d01913404581 100644
--- a/sound/core/seq/seq_clientmgr.c
+++ b/sound/core/seq/seq_clientmgr.c
@@ -221,6 +221,7 @@ static struct snd_seq_client *seq_create_client1(int client_index, int poolsize)
rwlock_init(&client->ports_lock);
mutex_init(&client->ports_mutex);
INIT_LIST_HEAD(&client->ports_list_head);
+ mutex_init(&client->ioctl_mutex);
/* find free slot in the client table */
spin_lock_irqsave(&clients_lock, flags);
@@ -2130,7 +2131,9 @@ static long snd_seq_ioctl(struct file *file, unsigned int cmd,
return -EFAULT;
}
+ mutex_lock(&client->ioctl_mutex);
err = handler->func(client, &buf);
+ mutex_unlock(&client->ioctl_mutex);
if (err >= 0) {
/* Some commands includes a bug in 'dir' field. */
if (handler->cmd == SNDRV_SEQ_IOCTL_SET_QUEUE_CLIENT ||
diff --git a/sound/core/seq/seq_clientmgr.h b/sound/core/seq/seq_clientmgr.h
index c6614254ef8a..0611e1e0ed5b 100644
--- a/sound/core/seq/seq_clientmgr.h
+++ b/sound/core/seq/seq_clientmgr.h
@@ -61,6 +61,7 @@ struct snd_seq_client {
struct list_head ports_list_head;
rwlock_t ports_lock;
struct mutex ports_mutex;
+ struct mutex ioctl_mutex;
int convert32; /* convert 32->64bit */
/* output pool */
When saving BOs in the hang state we skip one entry of the
kernel_state->bo[] array, thus leaving it NULL. This leads to a NULL
pointer dereference when, later in this function, we iterate over all
BOs to check their ->madv state.
Fixes: ca26d28bbaa3 ("drm/vc4: improve throughput by pipelining binning and rendering jobs")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon(a)free-electrons.com>
---
Changes in v2:
- Get rid of prev_idx and replace it with k, which is independently
incremented every time a new object is added to kernel_state->bo[].
- Add a WARN_ON_ONCE() if the final value of k is inconsistent
---
drivers/gpu/drm/vc4/vc4_gem.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
index 6c32c89a83a9..3216f12052fe 100644
--- a/drivers/gpu/drm/vc4/vc4_gem.c
+++ b/drivers/gpu/drm/vc4/vc4_gem.c
@@ -146,7 +146,7 @@ vc4_save_hang_state(struct drm_device *dev)
struct vc4_exec_info *exec[2];
struct vc4_bo *bo;
unsigned long irqflags;
- unsigned int i, j, unref_list_count, prev_idx;
+ unsigned int i, j, k, unref_list_count;
kernel_state = kcalloc(1, sizeof(*kernel_state), GFP_KERNEL);
if (!kernel_state)
@@ -182,7 +182,7 @@ vc4_save_hang_state(struct drm_device *dev)
return;
}
- prev_idx = 0;
+ k = 0;
for (i = 0; i < 2; i++) {
if (!exec[i])
continue;
@@ -197,7 +197,7 @@ vc4_save_hang_state(struct drm_device *dev)
WARN_ON(!refcount_read(&bo->usecnt));
refcount_inc(&bo->usecnt);
drm_gem_object_get(&exec[i]->bo[j]->base);
- kernel_state->bo[j + prev_idx] = &exec[i]->bo[j]->base;
+ kernel_state->bo[k++] = &exec[i]->bo[j]->base;
}
list_for_each_entry(bo, &exec[i]->unref_list, unref_head) {
@@ -205,12 +205,12 @@ vc4_save_hang_state(struct drm_device *dev)
* because they are naturally unpurgeable.
*/
drm_gem_object_get(&bo->base.base);
- kernel_state->bo[j + prev_idx] = &bo->base.base;
- j++;
+ kernel_state->bo[k++] = &bo->base.base;
}
- prev_idx = j + 1;
}
+ WARN_ON_ONCE(k != state->bo_count);
+
if (exec[0])
state->start_bin = exec[0]->ct0ca;
if (exec[1])
--
2.11.0
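As a reading aid, here is the resulting indexing logic in isolation, a simplified sketch of the loop after this patch with the reference counting elided: a single counter k advances for every BO saved, whether it comes from exec[i]->bo[] or from unref_list, so the hole that the old 'prev_idx = j + 1' arithmetic left in kernel_state->bo[] can no longer occur.

	k = 0;
	for (i = 0; i < 2; i++) {
		if (!exec[i])
			continue;

		for (j = 0; j < exec[i]->bo_count; j++)
			kernel_state->bo[k++] = &exec[i]->bo[j]->base;

		list_for_each_entry(bo, &exec[i]->unref_list, unref_head)
			kernel_state->bo[k++] = &bo->base.base;
	}
	/* every slot allocated for state->bo_count entries must now be used */
	WARN_ON_ONCE(k != state->bo_count);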
This is a note to let you know that I've just added the patch titled
powerpc/pseries: Query hypervisor for RFI flush settings
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
powerpc-pseries-query-hypervisor-for-rfi-flush-settings.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 8989d56878a7735dfdb234707a2fee6faf631085 Mon Sep 17 00:00:00 2001
From: Michael Neuling <mikey(a)neuling.org>
Date: Wed, 10 Jan 2018 03:07:15 +1100
Subject: powerpc/pseries: Query hypervisor for RFI flush settings
From: Michael Neuling <mikey(a)neuling.org>
commit 8989d56878a7735dfdb234707a2fee6faf631085 upstream.
A new hypervisor call is available which tells the guest settings
related to the RFI flush. Use it to query the appropriate flush
instruction(s), and whether the flush is required.
Signed-off-by: Michael Neuling <mikey(a)neuling.org>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/platforms/pseries/setup.c | 35 +++++++++++++++++++++++++++++++++
1 file changed, 35 insertions(+)
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -459,6 +459,39 @@ static void __init find_and_init_phbs(vo
of_pci_check_probe_only();
}
+static void pseries_setup_rfi_flush(void)
+{
+ struct h_cpu_char_result result;
+ enum l1d_flush_type types;
+ bool enable;
+ long rc;
+
+ /* Enable by default */
+ enable = true;
+
+ rc = plpar_get_cpu_characteristics(&result);
+ if (rc == H_SUCCESS) {
+ types = L1D_FLUSH_NONE;
+
+ if (result.character & H_CPU_CHAR_L1D_FLUSH_TRIG2)
+ types |= L1D_FLUSH_MTTRIG;
+ if (result.character & H_CPU_CHAR_L1D_FLUSH_ORI30)
+ types |= L1D_FLUSH_ORI;
+
+ /* Use fallback if nothing set in hcall */
+ if (types == L1D_FLUSH_NONE)
+ types = L1D_FLUSH_FALLBACK;
+
+ if (!(result.behaviour & H_CPU_BEHAV_L1D_FLUSH_PR))
+ enable = false;
+ } else {
+ /* Default to fallback if case hcall is not available */
+ types = L1D_FLUSH_FALLBACK;
+ }
+
+ setup_rfi_flush(types, enable);
+}
+
static void __init pSeries_setup_arch(void)
{
set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);
@@ -476,6 +509,8 @@ static void __init pSeries_setup_arch(vo
fwnmi_init();
+ pseries_setup_rfi_flush();
+
/* By default, only probe PCI (can be overridden by rtas_pci) */
pci_add_flags(PCI_PROBE_ONLY);
Patches currently in stable-queue which might be from mikey(a)neuling.org are
queue-4.14/powerpc-pseries-add-h_get_cpu_characteristics-flags-wrapper.patch
queue-4.14/powerpc-pseries-query-hypervisor-for-rfi-flush-settings.patch
This is a note to let you know that I've just added the patch titled
powerpc/powernv: Check device-tree for RFI flush settings
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
powerpc-powernv-check-device-tree-for-rfi-flush-settings.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 6e032b350cd1fdb830f18f8320ef0e13b4e24094 Mon Sep 17 00:00:00 2001
From: Oliver O'Halloran <oohall(a)gmail.com>
Date: Wed, 10 Jan 2018 03:07:15 +1100
Subject: powerpc/powernv: Check device-tree for RFI flush settings
From: Oliver O'Halloran <oohall(a)gmail.com>
commit 6e032b350cd1fdb830f18f8320ef0e13b4e24094 upstream.
New device-tree properties are available which tell the hypervisor
settings related to the RFI flush. Use them to determine the
appropriate flush instruction to use, and whether the flush is
required.
Signed-off-by: Oliver O'Halloran <oohall(a)gmail.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/platforms/powernv/setup.c | 49 +++++++++++++++++++++++++++++++++
1 file changed, 49 insertions(+)
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -36,13 +36,62 @@
#include <asm/opal.h>
#include <asm/kexec.h>
#include <asm/smp.h>
+#include <asm/setup.h>
#include "powernv.h"
+static void pnv_setup_rfi_flush(void)
+{
+ struct device_node *np, *fw_features;
+ enum l1d_flush_type type;
+ int enable;
+
+ /* Default to fallback in case fw-features are not available */
+ type = L1D_FLUSH_FALLBACK;
+ enable = 1;
+
+ np = of_find_node_by_name(NULL, "ibm,opal");
+ fw_features = of_get_child_by_name(np, "fw-features");
+ of_node_put(np);
+
+ if (fw_features) {
+ np = of_get_child_by_name(fw_features, "inst-l1d-flush-trig2");
+ if (np && of_property_read_bool(np, "enabled"))
+ type = L1D_FLUSH_MTTRIG;
+
+ of_node_put(np);
+
+ np = of_get_child_by_name(fw_features, "inst-l1d-flush-ori30,30,0");
+ if (np && of_property_read_bool(np, "enabled"))
+ type = L1D_FLUSH_ORI;
+
+ of_node_put(np);
+
+ /* Enable unless firmware says NOT to */
+ enable = 2;
+ np = of_get_child_by_name(fw_features, "needs-l1d-flush-msr-hv-1-to-0");
+ if (np && of_property_read_bool(np, "disabled"))
+ enable--;
+
+ of_node_put(np);
+
+ np = of_get_child_by_name(fw_features, "needs-l1d-flush-msr-pr-0-to-1");
+ if (np && of_property_read_bool(np, "disabled"))
+ enable--;
+
+ of_node_put(np);
+ of_node_put(fw_features);
+ }
+
+ setup_rfi_flush(type, enable > 0);
+}
+
static void __init pnv_setup_arch(void)
{
set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);
+ pnv_setup_rfi_flush();
+
/* Initialize SMP */
pnv_smp_init();
Patches currently in stable-queue which might be from oohall(a)gmail.com are
queue-4.14/powerpc-powernv-check-device-tree-for-rfi-flush-settings.patch
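The "enable" accounting in pnv_setup_rfi_flush() above is a small vote: it starts at 2 and is decremented once for each fw-features node that carries a "disabled" property. Spelled out as a truth table (a restatement of the logic in the diff, not new behaviour):

	/*
	 * hv-1-to-0 node    pr-0-to-1 node    enable   flush patched in?
	 * not "disabled"    not "disabled"       2          yes
	 * "disabled"        not "disabled"       1          yes
	 * not "disabled"    "disabled"           1          yes
	 * "disabled"        "disabled"           0          no
	 *
	 * setup_rfi_flush(type, enable > 0) therefore only skips the flush
	 * when firmware marks both transitions as not needing it; an absent
	 * node counts the same as a node without "disabled".
	 */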
This is a note to let you know that I've just added the patch titled
powerpc/64s: Support disabling RFI flush with no_rfi_flush and nopti
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
powerpc-64s-support-disabling-rfi-flush-with-no_rfi_flush-and-nopti.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From bc9c9304a45480797e13a8e1df96ffcf44fb62fe Mon Sep 17 00:00:00 2001
From: Michael Ellerman <mpe(a)ellerman.id.au>
Date: Wed, 10 Jan 2018 03:07:15 +1100
Subject: powerpc/64s: Support disabling RFI flush with no_rfi_flush and nopti
From: Michael Ellerman <mpe(a)ellerman.id.au>
commit bc9c9304a45480797e13a8e1df96ffcf44fb62fe upstream.
Because there may be some performance overhead of the RFI flush, add
kernel command line options to disable it.
We add a sensibly named 'no_rfi_flush' option, but we also hijack the
x86 option 'nopti'. The RFI flush is not the same as KPTI, but if we
see 'nopti' we can guess that the user is trying to avoid any overhead
of Meltdown mitigations, and it means we don't have to educate everyone
about a different command line option.
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/kernel/setup_64.c | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -788,8 +788,29 @@ early_initcall(disable_hardlockup_detect
#ifdef CONFIG_PPC_BOOK3S_64
static enum l1d_flush_type enabled_flush_types;
static void *l1d_flush_fallback_area;
+static bool no_rfi_flush;
bool rfi_flush;
+static int __init handle_no_rfi_flush(char *p)
+{
+ pr_info("rfi-flush: disabled on command line.");
+ no_rfi_flush = true;
+ return 0;
+}
+early_param("no_rfi_flush", handle_no_rfi_flush);
+
+/*
+ * The RFI flush is not KPTI, but because users will see doco that says to use
+ * nopti we hijack that option here to also disable the RFI flush.
+ */
+static int __init handle_no_pti(char *p)
+{
+ pr_info("rfi-flush: disabling due to 'nopti' on command line.\n");
+ handle_no_rfi_flush(NULL);
+ return 0;
+}
+early_param("nopti", handle_no_pti);
+
static void do_nothing(void *unused)
{
/*
@@ -860,6 +881,7 @@ void __init setup_rfi_flush(enum l1d_flu
enabled_flush_types = types;
- rfi_flush_enable(enable);
+ if (!no_rfi_flush)
+ rfi_flush_enable(enable);
}
#endif /* CONFIG_PPC_BOOK3S_64 */
Patches currently in stable-queue which might be from mpe(a)ellerman.id.au are
queue-4.14/powerpc-64s-simple-rfi-macro-conversions.patch
queue-4.14/powerpc-64-add-macros-for-annotating-the-destination-of-rfid-hrfid.patch
queue-4.14/powerpc-pseries-add-h_get_cpu_characteristics-flags-wrapper.patch
queue-4.14/powerpc-powernv-check-device-tree-for-rfi-flush-settings.patch
queue-4.14/powerpc-64s-convert-slb_miss_common-to-use-rfi_to_user-kernel.patch
queue-4.14/powerpc-64s-support-disabling-rfi-flush-with-no_rfi_flush-and-nopti.patch
queue-4.14/powerpc-64-convert-the-syscall-exit-path-to-use-rfi_to_user-kernel.patch
queue-4.14/powerpc-64s-add-support-for-rfi-flush-of-l1-d-cache.patch
queue-4.14/powerpc-pseries-query-hypervisor-for-rfi-flush-settings.patch
queue-4.14/powerpc-64-convert-fast_exception_return-to-use-rfi_to_user-kernel.patch
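A short restatement of what the new options do, kept in comment form since the diff above is already the whole mechanism:

	/* Either of
	 *     no_rfi_flush        (the native option)
	 *     nopti               (the borrowed x86 spelling)
	 * on the kernel command line sets no_rfi_flush = true via
	 * early_param().  setup_rfi_flush() then skips the initial
	 * rfi_flush_enable(enable) call; the nop slots and the fixup
	 * machinery themselves remain in place. */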
This is a note to let you know that I've just added the patch titled
powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNEL
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
powerpc-64s-convert-slb_miss_common-to-use-rfi_to_user-kernel.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From c7305645eb0c1621351cfc104038831ae87c0053 Mon Sep 17 00:00:00 2001
From: Nicholas Piggin <npiggin(a)gmail.com>
Date: Wed, 10 Jan 2018 03:07:15 +1100
Subject: powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNEL
From: Nicholas Piggin <npiggin(a)gmail.com>
commit c7305645eb0c1621351cfc104038831ae87c0053 upstream.
In the SLB miss handler we may be returning to user or kernel. We need
to add a check early on and save the result in the cr4 register, and
then we bifurcate the return path based on that.
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/kernel/exceptions-64s.S | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -596,6 +596,9 @@ EXC_COMMON_BEGIN(slb_miss_common)
stw r9,PACA_EXSLB+EX_CCR(r13) /* save CR in exc. frame */
std r10,PACA_EXSLB+EX_LR(r13) /* save LR */
+ andi. r9,r11,MSR_PR // Check for exception from userspace
+ cmpdi cr4,r9,MSR_PR // And save the result in CR4 for later
+
/*
* Test MSR_RI before calling slb_allocate_realmode, because the
* MSR in r11 gets clobbered. However we still want to allocate
@@ -622,9 +625,32 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_R
/* All done -- return from exception. */
+ bne cr4,1f /* returning to kernel */
+
+.machine push
+.machine "power4"
+ mtcrf 0x80,r9
+ mtcrf 0x08,r9 /* MSR[PR] indication is in cr4 */
+ mtcrf 0x04,r9 /* MSR[RI] indication is in cr5 */
+ mtcrf 0x02,r9 /* I/D indication is in cr6 */
+ mtcrf 0x01,r9 /* slb_allocate uses cr0 and cr7 */
+.machine pop
+
+ RESTORE_CTR(r9, PACA_EXSLB)
+ RESTORE_PPR_PACA(PACA_EXSLB, r9)
+ mr r3,r12
+ ld r9,PACA_EXSLB+EX_R9(r13)
+ ld r10,PACA_EXSLB+EX_R10(r13)
+ ld r11,PACA_EXSLB+EX_R11(r13)
+ ld r12,PACA_EXSLB+EX_R12(r13)
+ ld r13,PACA_EXSLB+EX_R13(r13)
+ RFI_TO_USER
+ b . /* prevent speculative execution */
+1:
.machine push
.machine "power4"
mtcrf 0x80,r9
+ mtcrf 0x08,r9 /* MSR[PR] indication is in cr4 */
mtcrf 0x04,r9 /* MSR[RI] indication is in cr5 */
mtcrf 0x02,r9 /* I/D indication is in cr6 */
mtcrf 0x01,r9 /* slb_allocate uses cr0 and cr7 */
@@ -638,9 +664,10 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_R
ld r11,PACA_EXSLB+EX_R11(r13)
ld r12,PACA_EXSLB+EX_R12(r13)
ld r13,PACA_EXSLB+EX_R13(r13)
- rfid
+ RFI_TO_KERNEL
b . /* prevent speculative execution */
+
2: std r3,PACA_EXSLB+EX_DAR(r13)
mr r3,r12
mfspr r11,SPRN_SRR0
Patches currently in stable-queue which might be from npiggin(a)gmail.com are
queue-4.14/powerpc-64s-simple-rfi-macro-conversions.patch
queue-4.14/powerpc-64-add-macros-for-annotating-the-destination-of-rfid-hrfid.patch
queue-4.14/powerpc-64s-convert-slb_miss_common-to-use-rfi_to_user-kernel.patch
queue-4.14/powerpc-64-convert-the-syscall-exit-path-to-use-rfi_to_user-kernel.patch
queue-4.14/powerpc-64s-add-support-for-rfi-flush-of-l1-d-cache.patch
queue-4.14/powerpc-64-convert-fast_exception_return-to-use-rfi_to_user-kernel.patch
This is a note to let you know that I've just added the patch titled
powerpc/64s: Add support for RFI flush of L1-D cache
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
powerpc-64s-add-support-for-rfi-flush-of-l1-d-cache.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From aa8a5e0062ac940f7659394f4817c948dc8c0667 Mon Sep 17 00:00:00 2001
From: Michael Ellerman <mpe(a)ellerman.id.au>
Date: Wed, 10 Jan 2018 03:07:15 +1100
Subject: powerpc/64s: Add support for RFI flush of L1-D cache
From: Michael Ellerman <mpe(a)ellerman.id.au>
commit aa8a5e0062ac940f7659394f4817c948dc8c0667 upstream.
On some CPUs we can prevent the Meltdown vulnerability by flushing the
L1-D cache on exit from kernel to user mode, and from hypervisor to
guest.
This is known to be the case on at least Power7, Power8 and Power9. At
this time we do not know the status of the vulnerability on other CPUs
such as the 970 (Apple G5), pasemi CPUs (AmigaOne X1000) or Freescale
CPUs. As more information comes to light we can enable this, or other
mechanisms on those CPUs.
The vulnerability occurs when the load of an architecturally
inaccessible memory region (eg. userspace load of kernel memory) is
speculatively executed to the point where its result can influence the
address of a subsequent speculatively executed load.
In order for that to happen, the first load must hit in the L1,
because before the load is sent to the L2 the permission check is
performed. Therefore if no kernel addresses hit in the L1 the
vulnerability can not occur. We can ensure that is the case by
flushing the L1 whenever we return to userspace. Similarly for
hypervisor vs guest.
In order to flush the L1-D cache on exit, we add a section of nops at
each (h)rfi location that returns to a lower privileged context, and
patch that with some sequence. Newer firmwares are able to advertise
to us that there is a special nop instruction that flushes the L1-D.
If we do not see that advertised, we fall back to doing a displacement
flush in software.
For guest kernels we support migration between some CPU versions, and
different CPUs may use different flush instructions. So that we are
prepared to migrate to a machine with a different flush instruction
activated, we may have to patch more than one flush instruction at
boot if the hypervisor tells us to.
In the end this patch is mostly the work of Nicholas Piggin and
Michael Ellerman. However, a cast of thousands contributed to analysis
of the issue, earlier versions of the patch, backports, testing, etc.
Many thanks to all of them.
Tested-by: Jon Masters <jcm(a)redhat.com>
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/include/asm/exception-64s.h | 40 +++++++++++---
arch/powerpc/include/asm/feature-fixups.h | 13 ++++
arch/powerpc/include/asm/paca.h | 10 +++
arch/powerpc/include/asm/setup.h | 13 ++++
arch/powerpc/kernel/asm-offsets.c | 5 +
arch/powerpc/kernel/exceptions-64s.S | 84 ++++++++++++++++++++++++++++++
arch/powerpc/kernel/setup_64.c | 79 ++++++++++++++++++++++++++++
arch/powerpc/kernel/vmlinux.lds.S | 9 +++
arch/powerpc/lib/feature-fixups.c | 41 ++++++++++++++
9 files changed, 286 insertions(+), 8 deletions(-)
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -69,34 +69,58 @@
*/
#define EX_R3 EX_DAR
-/* Macros for annotating the expected destination of (h)rfid */
+/*
+ * Macros for annotating the expected destination of (h)rfid
+ *
+ * The nop instructions allow us to insert one or more instructions to flush the
+ * L1-D cache when returning to userspace or a guest.
+ */
+#define RFI_FLUSH_SLOT \
+ RFI_FLUSH_FIXUP_SECTION; \
+ nop; \
+ nop; \
+ nop
#define RFI_TO_KERNEL \
rfid
#define RFI_TO_USER \
- rfid
+ RFI_FLUSH_SLOT; \
+ rfid; \
+ b rfi_flush_fallback
#define RFI_TO_USER_OR_KERNEL \
- rfid
+ RFI_FLUSH_SLOT; \
+ rfid; \
+ b rfi_flush_fallback
#define RFI_TO_GUEST \
- rfid
+ RFI_FLUSH_SLOT; \
+ rfid; \
+ b rfi_flush_fallback
#define HRFI_TO_KERNEL \
hrfid
#define HRFI_TO_USER \
- hrfid
+ RFI_FLUSH_SLOT; \
+ hrfid; \
+ b hrfi_flush_fallback
#define HRFI_TO_USER_OR_KERNEL \
- hrfid
+ RFI_FLUSH_SLOT; \
+ hrfid; \
+ b hrfi_flush_fallback
#define HRFI_TO_GUEST \
- hrfid
+ RFI_FLUSH_SLOT; \
+ hrfid; \
+ b hrfi_flush_fallback
#define HRFI_TO_UNKNOWN \
- hrfid
+ RFI_FLUSH_SLOT; \
+ hrfid; \
+ b hrfi_flush_fallback
#ifdef CONFIG_RELOCATABLE
#define __EXCEPTION_RELON_PROLOG_PSERIES_1(label, h) \
--- a/arch/powerpc/include/asm/feature-fixups.h
+++ b/arch/powerpc/include/asm/feature-fixups.h
@@ -187,7 +187,20 @@ label##3: \
FTR_ENTRY_OFFSET label##1b-label##3b; \
.popsection;
+#define RFI_FLUSH_FIXUP_SECTION \
+951: \
+ .pushsection __rfi_flush_fixup,"a"; \
+ .align 2; \
+952: \
+ FTR_ENTRY_OFFSET 951b-952b; \
+ .popsection;
+
+
#ifndef __ASSEMBLY__
+#include <linux/types.h>
+
+extern long __start___rfi_flush_fixup, __stop___rfi_flush_fixup;
+
void apply_feature_fixups(void);
void setup_feature_keys(void);
#endif
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -231,6 +231,16 @@ struct paca_struct {
struct sibling_subcore_state *sibling_subcore_state;
#endif
#endif
+#ifdef CONFIG_PPC_BOOK3S_64
+ /*
+ * rfi fallback flush must be in its own cacheline to prevent
+ * other paca data leaking into the L1d
+ */
+ u64 exrfi[EX_SIZE] __aligned(0x80);
+ void *rfi_flush_fallback_area;
+ u64 l1d_flush_congruence;
+ u64 l1d_flush_sets;
+#endif
};
extern void copy_mm_to_paca(struct mm_struct *mm);
--- a/arch/powerpc/include/asm/setup.h
+++ b/arch/powerpc/include/asm/setup.h
@@ -39,6 +39,19 @@ static inline void pseries_big_endian_ex
static inline void pseries_little_endian_exceptions(void) {}
#endif /* CONFIG_PPC_PSERIES */
+void rfi_flush_enable(bool enable);
+
+/* These are bit flags */
+enum l1d_flush_type {
+ L1D_FLUSH_NONE = 0x1,
+ L1D_FLUSH_FALLBACK = 0x2,
+ L1D_FLUSH_ORI = 0x4,
+ L1D_FLUSH_MTTRIG = 0x8,
+};
+
+void __init setup_rfi_flush(enum l1d_flush_type, bool enable);
+void do_rfi_flush_fixups(enum l1d_flush_type types);
+
#endif /* !__ASSEMBLY__ */
#endif /* _ASM_POWERPC_SETUP_H */
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -237,6 +237,11 @@ int main(void)
OFFSET(PACA_NMI_EMERG_SP, paca_struct, nmi_emergency_sp);
OFFSET(PACA_IN_MCE, paca_struct, in_mce);
OFFSET(PACA_IN_NMI, paca_struct, in_nmi);
+ OFFSET(PACA_RFI_FLUSH_FALLBACK_AREA, paca_struct, rfi_flush_fallback_area);
+ OFFSET(PACA_EXRFI, paca_struct, exrfi);
+ OFFSET(PACA_L1D_FLUSH_CONGRUENCE, paca_struct, l1d_flush_congruence);
+ OFFSET(PACA_L1D_FLUSH_SETS, paca_struct, l1d_flush_sets);
+
#endif
OFFSET(PACAHWCPUID, paca_struct, hw_cpu_id);
OFFSET(PACAKEXECSTATE, paca_struct, kexec_state);
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1434,6 +1434,90 @@ masked_##_H##interrupt: \
b .; \
MASKED_DEC_HANDLER(_H)
+TRAMP_REAL_BEGIN(rfi_flush_fallback)
+ SET_SCRATCH0(r13);
+ GET_PACA(r13);
+ std r9,PACA_EXRFI+EX_R9(r13)
+ std r10,PACA_EXRFI+EX_R10(r13)
+ std r11,PACA_EXRFI+EX_R11(r13)
+ std r12,PACA_EXRFI+EX_R12(r13)
+ std r8,PACA_EXRFI+EX_R13(r13)
+ mfctr r9
+ ld r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
+ ld r11,PACA_L1D_FLUSH_SETS(r13)
+ ld r12,PACA_L1D_FLUSH_CONGRUENCE(r13)
+ /*
+ * The load adresses are at staggered offsets within cachelines,
+ * which suits some pipelines better (on others it should not
+ * hurt).
+ */
+ addi r12,r12,8
+ mtctr r11
+ DCBT_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */
+
+ /* order ld/st prior to dcbt stop all streams with flushing */
+ sync
+1: li r8,0
+ .rept 8 /* 8-way set associative */
+ ldx r11,r10,r8
+ add r8,r8,r12
+ xor r11,r11,r11 // Ensure r11 is 0 even if fallback area is not
+ add r8,r8,r11 // Add 0, this creates a dependency on the ldx
+ .endr
+ addi r10,r10,128 /* 128 byte cache line */
+ bdnz 1b
+
+ mtctr r9
+ ld r9,PACA_EXRFI+EX_R9(r13)
+ ld r10,PACA_EXRFI+EX_R10(r13)
+ ld r11,PACA_EXRFI+EX_R11(r13)
+ ld r12,PACA_EXRFI+EX_R12(r13)
+ ld r8,PACA_EXRFI+EX_R13(r13)
+ GET_SCRATCH0(r13);
+ rfid
+
+TRAMP_REAL_BEGIN(hrfi_flush_fallback)
+ SET_SCRATCH0(r13);
+ GET_PACA(r13);
+ std r9,PACA_EXRFI+EX_R9(r13)
+ std r10,PACA_EXRFI+EX_R10(r13)
+ std r11,PACA_EXRFI+EX_R11(r13)
+ std r12,PACA_EXRFI+EX_R12(r13)
+ std r8,PACA_EXRFI+EX_R13(r13)
+ mfctr r9
+ ld r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
+ ld r11,PACA_L1D_FLUSH_SETS(r13)
+ ld r12,PACA_L1D_FLUSH_CONGRUENCE(r13)
+ /*
+ * The load adresses are at staggered offsets within cachelines,
+ * which suits some pipelines better (on others it should not
+ * hurt).
+ */
+ addi r12,r12,8
+ mtctr r11
+ DCBT_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */
+
+ /* order ld/st prior to dcbt stop all streams with flushing */
+ sync
+1: li r8,0
+ .rept 8 /* 8-way set associative */
+ ldx r11,r10,r8
+ add r8,r8,r12
+ xor r11,r11,r11 // Ensure r11 is 0 even if fallback area is not
+ add r8,r8,r11 // Add 0, this creates a dependency on the ldx
+ .endr
+ addi r10,r10,128 /* 128 byte cache line */
+ bdnz 1b
+
+ mtctr r9
+ ld r9,PACA_EXRFI+EX_R9(r13)
+ ld r10,PACA_EXRFI+EX_R10(r13)
+ ld r11,PACA_EXRFI+EX_R11(r13)
+ ld r12,PACA_EXRFI+EX_R12(r13)
+ ld r8,PACA_EXRFI+EX_R13(r13)
+ GET_SCRATCH0(r13);
+ hrfid
+
/*
* Real mode exceptions actually use this too, but alternate
* instruction code patches (which end up in the common .text area)
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -784,3 +784,82 @@ static int __init disable_hardlockup_det
return 0;
}
early_initcall(disable_hardlockup_detector);
+
+#ifdef CONFIG_PPC_BOOK3S_64
+static enum l1d_flush_type enabled_flush_types;
+static void *l1d_flush_fallback_area;
+bool rfi_flush;
+
+static void do_nothing(void *unused)
+{
+ /*
+ * We don't need to do the flush explicitly, just enter+exit kernel is
+ * sufficient, the RFI exit handlers will do the right thing.
+ */
+}
+
+void rfi_flush_enable(bool enable)
+{
+ if (rfi_flush == enable)
+ return;
+
+ if (enable) {
+ do_rfi_flush_fixups(enabled_flush_types);
+ on_each_cpu(do_nothing, NULL, 1);
+ } else
+ do_rfi_flush_fixups(L1D_FLUSH_NONE);
+
+ rfi_flush = enable;
+}
+
+static void init_fallback_flush(void)
+{
+ u64 l1d_size, limit;
+ int cpu;
+
+ l1d_size = ppc64_caches.l1d.size;
+ limit = min(safe_stack_limit(), ppc64_rma_size);
+
+ /*
+ * Align to L1d size, and size it at 2x L1d size, to catch possible
+ * hardware prefetch runoff. We don't have a recipe for load patterns to
+ * reliably avoid the prefetcher.
+ */
+ l1d_flush_fallback_area = __va(memblock_alloc_base(l1d_size * 2, l1d_size, limit));
+ memset(l1d_flush_fallback_area, 0, l1d_size * 2);
+
+ for_each_possible_cpu(cpu) {
+ /*
+ * The fallback flush is currently coded for 8-way
+ * associativity. Different associativity is possible, but it
+ * will be treated as 8-way and may not evict the lines as
+ * effectively.
+ *
+ * 128 byte lines are mandatory.
+ */
+ u64 c = l1d_size / 8;
+
+ paca[cpu].rfi_flush_fallback_area = l1d_flush_fallback_area;
+ paca[cpu].l1d_flush_congruence = c;
+ paca[cpu].l1d_flush_sets = c / 128;
+ }
+}
+
+void __init setup_rfi_flush(enum l1d_flush_type types, bool enable)
+{
+ if (types & L1D_FLUSH_FALLBACK) {
+ pr_info("rfi-flush: Using fallback displacement flush\n");
+ init_fallback_flush();
+ }
+
+ if (types & L1D_FLUSH_ORI)
+ pr_info("rfi-flush: Using ori type flush\n");
+
+ if (types & L1D_FLUSH_MTTRIG)
+ pr_info("rfi-flush: Using mttrig type flush\n");
+
+ enabled_flush_types = types;
+
+ rfi_flush_enable(enable);
+}
+#endif /* CONFIG_PPC_BOOK3S_64 */
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -132,6 +132,15 @@ SECTIONS
/* Read-only data */
RO_DATA(PAGE_SIZE)
+#ifdef CONFIG_PPC64
+ . = ALIGN(8);
+ __rfi_flush_fixup : AT(ADDR(__rfi_flush_fixup) - LOAD_OFFSET) {
+ __start___rfi_flush_fixup = .;
+ *(__rfi_flush_fixup)
+ __stop___rfi_flush_fixup = .;
+ }
+#endif
+
EXCEPTION_TABLE(0)
NOTES :kernel :notes
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -116,6 +116,47 @@ void do_feature_fixups(unsigned long val
}
}
+#ifdef CONFIG_PPC_BOOK3S_64
+void do_rfi_flush_fixups(enum l1d_flush_type types)
+{
+ unsigned int instrs[3], *dest;
+ long *start, *end;
+ int i;
+
+ start = PTRRELOC(&__start___rfi_flush_fixup),
+ end = PTRRELOC(&__stop___rfi_flush_fixup);
+
+ instrs[0] = 0x60000000; /* nop */
+ instrs[1] = 0x60000000; /* nop */
+ instrs[2] = 0x60000000; /* nop */
+
+ if (types & L1D_FLUSH_FALLBACK)
+ /* b .+16 to fallback flush */
+ instrs[0] = 0x48000010;
+
+ i = 0;
+ if (types & L1D_FLUSH_ORI) {
+ instrs[i++] = 0x63ff0000; /* ori 31,31,0 speculation barrier */
+ instrs[i++] = 0x63de0000; /* ori 30,30,0 L1d flush*/
+ }
+
+ if (types & L1D_FLUSH_MTTRIG)
+ instrs[i++] = 0x7c12dba6; /* mtspr TRIG2,r0 (SPR #882) */
+
+ for (i = 0; start < end; start++, i++) {
+ dest = (void *)start + *start;
+
+ pr_devel("patching dest %lx\n", (unsigned long)dest);
+
+ patch_instruction(dest, instrs[0]);
+ patch_instruction(dest + 1, instrs[1]);
+ patch_instruction(dest + 2, instrs[2]);
+ }
+
+ printk(KERN_DEBUG "rfi-flush: patched %d locations\n", i);
+}
+#endif /* CONFIG_PPC_BOOK3S_64 */
+
void do_lwsync_fixups(unsigned long value, void *fixup_start, void *fixup_end)
{
long *start, *end;
Patches currently in stable-queue which might be from mpe(a)ellerman.id.au are
queue-4.14/powerpc-64s-simple-rfi-macro-conversions.patch
queue-4.14/powerpc-64-add-macros-for-annotating-the-destination-of-rfid-hrfid.patch
queue-4.14/powerpc-pseries-add-h_get_cpu_characteristics-flags-wrapper.patch
queue-4.14/powerpc-powernv-check-device-tree-for-rfi-flush-settings.patch
queue-4.14/powerpc-64s-convert-slb_miss_common-to-use-rfi_to_user-kernel.patch
queue-4.14/powerpc-64s-support-disabling-rfi-flush-with-no_rfi_flush-and-nopti.patch
queue-4.14/powerpc-64-convert-the-syscall-exit-path-to-use-rfi_to_user-kernel.patch
queue-4.14/powerpc-64s-add-support-for-rfi-flush-of-l1-d-cache.patch
queue-4.14/powerpc-pseries-query-hypervisor-for-rfi-flush-settings.patch
queue-4.14/powerpc-64-convert-fast_exception_return-to-use-rfi_to_user-kernel.patch
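To make the fallback displacement flush above concrete, here are worked numbers for a 64 KB L1d; the capacity is an assumption for illustration, while the 8 ways and 128-byte lines are hard-coded in the patch itself:

	/*
	 * init_fallback_flush(), per CPU:
	 *   c                    = l1d_size / 8  = 65536 / 8 = 8192 bytes (one way)
	 *   l1d_flush_congruence = c             = 8192
	 *   l1d_flush_sets       = c / 128       = 64 sets
	 *   fallback area        = 2 * l1d_size  = 128 KB, zeroed, l1d_size-aligned
	 *
	 * rfi_flush_fallback / hrfi_flush_fallback then do:
	 *   mtctr(sets)            -> 64 outer iterations
	 *   .rept 8 staggered ldx  -> one load per way
	 *   addi r10,r10,128       -> advance one cache line per iteration
	 * so every set/way pair is displaced by data from the zeroed area
	 * before rfid/hrfid drops to the lower-privileged context.
	 */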
This is a note to let you know that I've just added the patch titled
powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNEL
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
powerpc-64-convert-the-syscall-exit-path-to-use-rfi_to_user-kernel.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b8e90cb7bc04a509e821e82ab6ed7a8ef11ba333 Mon Sep 17 00:00:00 2001
From: Nicholas Piggin <npiggin(a)gmail.com>
Date: Wed, 10 Jan 2018 03:07:15 +1100
Subject: powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNEL
From: Nicholas Piggin <npiggin(a)gmail.com>
commit b8e90cb7bc04a509e821e82ab6ed7a8ef11ba333 upstream.
In the syscall exit path we may be returning to user or kernel
context. We already have a test for that, because we conditionally
restore r13. So use that existing test and branch, and bifurcate the
return based on that.
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/kernel/entry_64.S | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -267,13 +267,23 @@ BEGIN_FTR_SECTION
END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
ld r13,GPR13(r1) /* only restore r13 if returning to usermode */
+ ld r2,GPR2(r1)
+ ld r1,GPR1(r1)
+ mtlr r4
+ mtcr r5
+ mtspr SPRN_SRR0,r7
+ mtspr SPRN_SRR1,r8
+ RFI_TO_USER
+ b . /* prevent speculative execution */
+
+ /* exit to kernel */
1: ld r2,GPR2(r1)
ld r1,GPR1(r1)
mtlr r4
mtcr r5
mtspr SPRN_SRR0,r7
mtspr SPRN_SRR1,r8
- RFI
+ RFI_TO_KERNEL
b . /* prevent speculative execution */
.Lsyscall_error:
Patches currently in stable-queue which might be from npiggin(a)gmail.com are
queue-4.14/powerpc-64s-simple-rfi-macro-conversions.patch
queue-4.14/powerpc-64-add-macros-for-annotating-the-destination-of-rfid-hrfid.patch
queue-4.14/powerpc-64s-convert-slb_miss_common-to-use-rfi_to_user-kernel.patch
queue-4.14/powerpc-64-convert-the-syscall-exit-path-to-use-rfi_to_user-kernel.patch
queue-4.14/powerpc-64s-add-support-for-rfi-flush-of-l1-d-cache.patch
queue-4.14/powerpc-64-convert-fast_exception_return-to-use-rfi_to_user-kernel.patch
This is a note to let you know that I've just added the patch titled
powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNEL
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
powerpc-64-convert-fast_exception_return-to-use-rfi_to_user-kernel.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From a08f828cf47e6c605af21d2cdec68f84e799c318 Mon Sep 17 00:00:00 2001
From: Nicholas Piggin <npiggin(a)gmail.com>
Date: Wed, 10 Jan 2018 03:07:15 +1100
Subject: powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNEL
From: Nicholas Piggin <npiggin(a)gmail.com>
commit a08f828cf47e6c605af21d2cdec68f84e799c318 upstream.
Similar to the syscall return path, in fast_exception_return we may be
returning to user or kernel context. We already have a test for that,
because we conditionally restore r13. So use that existing test and
branch, and bifurcate the return based on that.
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/kernel/entry_64.S | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -892,7 +892,7 @@ BEGIN_FTR_SECTION
END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
ACCOUNT_CPU_USER_EXIT(r13, r2, r4)
REST_GPR(13, r1)
-1:
+
mtspr SPRN_SRR1,r3
ld r2,_CCR(r1)
@@ -905,8 +905,22 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
ld r3,GPR3(r1)
ld r4,GPR4(r1)
ld r1,GPR1(r1)
+ RFI_TO_USER
+ b . /* prevent speculative execution */
- rfid
+1: mtspr SPRN_SRR1,r3
+
+ ld r2,_CCR(r1)
+ mtcrf 0xFF,r2
+ ld r2,_NIP(r1)
+ mtspr SPRN_SRR0,r2
+
+ ld r0,GPR0(r1)
+ ld r2,GPR2(r1)
+ ld r3,GPR3(r1)
+ ld r4,GPR4(r1)
+ ld r1,GPR1(r1)
+ RFI_TO_KERNEL
b . /* prevent speculative execution */
#endif /* CONFIG_PPC_BOOK3E */
Patches currently in stable-queue which might be from npiggin(a)gmail.com are
queue-4.14/powerpc-64s-simple-rfi-macro-conversions.patch
queue-4.14/powerpc-64-add-macros-for-annotating-the-destination-of-rfid-hrfid.patch
queue-4.14/powerpc-64s-convert-slb_miss_common-to-use-rfi_to_user-kernel.patch
queue-4.14/powerpc-64-convert-the-syscall-exit-path-to-use-rfi_to_user-kernel.patch
queue-4.14/powerpc-64s-add-support-for-rfi-flush-of-l1-d-cache.patch
queue-4.14/powerpc-64-convert-fast_exception_return-to-use-rfi_to_user-kernel.patch
This is a note to let you know that I've just added the patch titled
powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapper
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
powerpc-pseries-add-h_get_cpu_characteristics-flags-wrapper.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 191eccb1580939fb0d47deb405b82a85b0379070 Mon Sep 17 00:00:00 2001
From: Michael Neuling <mikey(a)neuling.org>
Date: Tue, 9 Jan 2018 03:52:05 +1100
Subject: powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapper
From: Michael Neuling <mikey(a)neuling.org>
commit 191eccb1580939fb0d47deb405b82a85b0379070 upstream.
A new hypervisor call has been defined to communicate various
characteristics of the CPU to guests. Add definitions for the hcall
number, flags and a wrapper function.
Signed-off-by: Michael Neuling <mikey(a)neuling.org>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/include/asm/hvcall.h | 17 +++++++++++++++++
arch/powerpc/include/asm/plpar_wrappers.h | 14 ++++++++++++++
2 files changed, 31 insertions(+)
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -241,6 +241,7 @@
#define H_GET_HCA_INFO 0x1B8
#define H_GET_PERF_COUNT 0x1BC
#define H_MANAGE_TRACE 0x1C0
+#define H_GET_CPU_CHARACTERISTICS 0x1C8
#define H_FREE_LOGICAL_LAN_BUFFER 0x1D4
#define H_QUERY_INT_STATE 0x1E4
#define H_POLL_PENDING 0x1D8
@@ -330,6 +331,17 @@
#define H_SIGNAL_SYS_RESET_ALL_OTHERS -2
/* >= 0 values are CPU number */
+/* H_GET_CPU_CHARACTERISTICS return values */
+#define H_CPU_CHAR_SPEC_BAR_ORI31 (1ull << 63) // IBM bit 0
+#define H_CPU_CHAR_BCCTRL_SERIALISED (1ull << 62) // IBM bit 1
+#define H_CPU_CHAR_L1D_FLUSH_ORI30 (1ull << 61) // IBM bit 2
+#define H_CPU_CHAR_L1D_FLUSH_TRIG2 (1ull << 60) // IBM bit 3
+#define H_CPU_CHAR_L1D_THREAD_PRIV (1ull << 59) // IBM bit 4
+
+#define H_CPU_BEHAV_FAVOUR_SECURITY (1ull << 63) // IBM bit 0
+#define H_CPU_BEHAV_L1D_FLUSH_PR (1ull << 62) // IBM bit 1
+#define H_CPU_BEHAV_BNDS_CHK_SPEC_BAR (1ull << 61) // IBM bit 2
+
/* Flag values used in H_REGISTER_PROC_TBL hcall */
#define PROC_TABLE_OP_MASK 0x18
#define PROC_TABLE_DEREG 0x10
@@ -436,6 +448,11 @@ static inline unsigned int get_longbusy_
}
}
+struct h_cpu_char_result {
+ u64 character;
+ u64 behaviour;
+};
+
#endif /* __ASSEMBLY__ */
#endif /* __KERNEL__ */
#endif /* _ASM_POWERPC_HVCALL_H */
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -326,4 +326,18 @@ static inline long plapr_signal_sys_rese
return plpar_hcall_norets(H_SIGNAL_SYS_RESET, cpu);
}
+static inline long plpar_get_cpu_characteristics(struct h_cpu_char_result *p)
+{
+ unsigned long retbuf[PLPAR_HCALL_BUFSIZE];
+ long rc;
+
+ rc = plpar_hcall(H_GET_CPU_CHARACTERISTICS, retbuf);
+ if (rc == H_SUCCESS) {
+ p->character = retbuf[0];
+ p->behaviour = retbuf[1];
+ }
+
+ return rc;
+}
+
#endif /* _ASM_POWERPC_PLPAR_WRAPPERS_H */
Patches currently in stable-queue which might be from mikey(a)neuling.org are
queue-4.14/powerpc-pseries-add-h_get_cpu_characteristics-flags-wrapper.patch
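For context, this is roughly how the new wrapper and flag definitions are consumed. The sketch below mirrors pseries_setup_rfi_flush() from the companion pseries patch earlier in this queue rather than introducing any new interface:

	struct h_cpu_char_result res;
	enum l1d_flush_type types = L1D_FLUSH_FALLBACK;
	bool enable = true;

	if (plpar_get_cpu_characteristics(&res) == H_SUCCESS) {
		types = L1D_FLUSH_NONE;
		if (res.character & H_CPU_CHAR_L1D_FLUSH_TRIG2)
			types |= L1D_FLUSH_MTTRIG;
		if (res.character & H_CPU_CHAR_L1D_FLUSH_ORI30)
			types |= L1D_FLUSH_ORI;
		/* fall back to the displacement flush if no instruction advertised */
		if (types == L1D_FLUSH_NONE)
			types = L1D_FLUSH_FALLBACK;
		if (!(res.behaviour & H_CPU_BEHAV_L1D_FLUSH_PR))
			enable = false;
	}
	setup_rfi_flush(types, enable);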
This is a note to let you know that I've just added the patch titled
powerpc/64: Add macros for annotating the destination of rfid/hrfid
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
powerpc-64-add-macros-for-annotating-the-destination-of-rfid-hrfid.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 50e51c13b3822d14ff6df4279423e4b7b2269bc3 Mon Sep 17 00:00:00 2001
From: Nicholas Piggin <npiggin(a)gmail.com>
Date: Wed, 10 Jan 2018 03:07:15 +1100
Subject: powerpc/64: Add macros for annotating the destination of rfid/hrfid
From: Nicholas Piggin <npiggin(a)gmail.com>
commit 50e51c13b3822d14ff6df4279423e4b7b2269bc3 upstream.
The rfid/hrfid ((Hypervisor) Return From Interrupt) instruction is
used for switching from the kernel to userspace, and from the
hypervisor to the guest kernel. However it can and is also used for
other transitions, eg. from real mode kernel code to virtual mode
kernel code, and it's not always clear from the code what the
destination context is.
To make it clearer when reading the code, add macros which encode the
expected destination context.
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/include/asm/exception-64e.h | 6 ++++++
arch/powerpc/include/asm/exception-64s.h | 29 +++++++++++++++++++++++++++++
2 files changed, 35 insertions(+)
--- a/arch/powerpc/include/asm/exception-64e.h
+++ b/arch/powerpc/include/asm/exception-64e.h
@@ -209,5 +209,11 @@ exc_##label##_book3e:
ori r3,r3,vector_offset@l; \
mtspr SPRN_IVOR##vector_number,r3;
+#define RFI_TO_KERNEL \
+ rfi
+
+#define RFI_TO_USER \
+ rfi
+
#endif /* _ASM_POWERPC_EXCEPTION_64E_H */
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -69,6 +69,35 @@
*/
#define EX_R3 EX_DAR
+/* Macros for annotating the expected destination of (h)rfid */
+
+#define RFI_TO_KERNEL \
+ rfid
+
+#define RFI_TO_USER \
+ rfid
+
+#define RFI_TO_USER_OR_KERNEL \
+ rfid
+
+#define RFI_TO_GUEST \
+ rfid
+
+#define HRFI_TO_KERNEL \
+ hrfid
+
+#define HRFI_TO_USER \
+ hrfid
+
+#define HRFI_TO_USER_OR_KERNEL \
+ hrfid
+
+#define HRFI_TO_GUEST \
+ hrfid
+
+#define HRFI_TO_UNKNOWN \
+ hrfid
+
#ifdef CONFIG_RELOCATABLE
#define __EXCEPTION_RELON_PROLOG_PSERIES_1(label, h) \
mfspr r11,SPRN_##h##SRR0; /* save SRR0 */ \
Patches currently in stable-queue which might be from npiggin(a)gmail.com are
queue-4.14/powerpc-64-add-macros-for-annotating-the-destination-of-rfid-hrfid.patch
This is a note to let you know that I've just added the patch titled
objtool: Fix seg fault with clang-compiled objects
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
objtool-fix-seg-fault-with-clang-compiled-objects.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ce90aaf5cde4ce057b297bb6c955caf16ef00ee6 Mon Sep 17 00:00:00 2001
From: Simon Ser <contact(a)emersion.fr>
Date: Sat, 30 Dec 2017 14:43:32 -0600
Subject: objtool: Fix seg fault with clang-compiled objects
From: Simon Ser <contact(a)emersion.fr>
commit ce90aaf5cde4ce057b297bb6c955caf16ef00ee6 upstream.
Fix a seg fault which happens when an input file provided to 'objtool
orc generate' doesn't have a '.shstrtab' section (for instance, object
files produced by clang don't have this section).
Signed-off-by: Simon Ser <contact(a)emersion.fr>
Signed-off-by: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/c0f2231683e9bed40fac1f13ce2c33b8389854bc.151466645…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Guenter Roeck <linux(a)roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/objtool/orc_gen.c | 2 ++
1 file changed, 2 insertions(+)
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -165,6 +165,8 @@ int create_orc_sections(struct objtool_f
/* create .orc_unwind_ip and .rela.orc_unwind_ip sections */
sec = elf_create_section(file->elf, ".orc_unwind_ip", sizeof(int), idx);
+ if (!sec)
+ return -1;
ip_relasec = elf_create_rela_section(file->elf, sec);
if (!ip_relasec)
Patches currently in stable-queue which might be from contact(a)emersion.fr are
queue-4.14/objtool-fix-seg-fault-with-clang-compiled-objects.patch
queue-4.14/objtool-fix-seg-fault-caused-by-missing-parameter.patch
This is a note to let you know that I've just added the patch titled
objtool: Fix Clang enum conversion warning
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
objtool-fix-clang-enum-conversion-warning.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From e7e83dd3ff1dd2f9e60213f6eedc7e5b08192062 Mon Sep 17 00:00:00 2001
From: Lukas Bulwahn <lukas.bulwahn(a)gmail.com>
Date: Tue, 26 Dec 2017 15:27:20 -0600
Subject: objtool: Fix Clang enum conversion warning
From: Lukas Bulwahn <lukas.bulwahn(a)gmail.com>
commit e7e83dd3ff1dd2f9e60213f6eedc7e5b08192062 upstream.
Fix the following Clang enum conversion warning:
arch/x86/decode.c:141:20: error: implicit conversion from enumeration
type 'enum op_src_type' to different enumeration
type 'enum op_dest_type' [-Werror,-Wenum-conversion]
op->dest.type = OP_SRC_REG;
~ ^~~~~~~~~~
It just happened to work before because OP_SRC_REG and OP_DEST_REG have
the same value.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn(a)gmail.com>
Signed-off-by: Josh Poimboeuf <jpoimboe(a)redhat.com>
Reviewed-by: Nicholas Mc Guire <der.herr(a)hofr.at>
Reviewed-by: Nick Desaulniers <nick.desaulniers(a)gmail.com>
Cc: Jiri Slaby <jslaby(a)suse.cz>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Fixes: baa41469a7b9 ("objtool: Implement stack validation 2.0")
Link: http://lkml.kernel.org/r/b4156c5738bae781c392e7a3691aed4514ebbdf2.151432356…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Guenter Roeck <linux(a)roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/objtool/arch/x86/decode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -138,7 +138,7 @@ int arch_decode_instruction(struct elf *
*type = INSN_STACK;
op->src.type = OP_SRC_ADD;
op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
- op->dest.type = OP_SRC_REG;
+ op->dest.type = OP_DEST_REG;
op->dest.reg = CFI_SP;
}
break;
Patches currently in stable-queue which might be from lukas.bulwahn(a)gmail.com are
queue-4.14/objtool-fix-clang-enum-conversion-warning.patch
This is a note to let you know that I've just added the patch titled
objtool: Fix seg fault caused by missing parameter
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
objtool-fix-seg-fault-caused-by-missing-parameter.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From d89e426499cf36b96161bd32970d6783f1fbcb0e Mon Sep 17 00:00:00 2001
From: Simon Ser <contact(a)emersion.fr>
Date: Sat, 30 Dec 2017 14:43:31 -0600
Subject: objtool: Fix seg fault caused by missing parameter
From: Simon Ser <contact(a)emersion.fr>
commit d89e426499cf36b96161bd32970d6783f1fbcb0e upstream.
Fix a seg fault when no parameter is provided to 'objtool orc'.
Signed-off-by: Simon Ser <contact(a)emersion.fr>
Signed-off-by: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/9172803ec7ebb72535bcd0b7f966ae96d515968e.151466645…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Guenter Roeck <linux(a)roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/objtool/builtin-orc.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/tools/objtool/builtin-orc.c
+++ b/tools/objtool/builtin-orc.c
@@ -44,6 +44,9 @@ int cmd_orc(int argc, const char **argv)
const char *objname;
argc--; argv++;
+ if (argc <= 0)
+ usage_with_options(orc_usage, check_options);
+
if (!strncmp(argv[0], "gen", 3)) {
argc = parse_options(argc, argv, check_options, orc_usage, 0);
if (argc != 1)
@@ -52,7 +55,6 @@ int cmd_orc(int argc, const char **argv)
objname = argv[0];
return check(objname, no_fp, no_unreachable, true);
-
}
if (!strcmp(argv[0], "dump")) {
Patches currently in stable-queue which might be from contact(a)emersion.fr are
queue-4.14/objtool-fix-seg-fault-with-clang-compiled-objects.patch
queue-4.14/objtool-fix-seg-fault-caused-by-missing-parameter.patch
If an invalid CAN FD frame is received, from a driver or from a tun
interface, a kernel warning is generated.
This patch replaces the WARN_ONCE() with a simple pr_warn_once(), so
that a kernel booted with panic_on_warn does not panic. A printk seems
more appropriate here.
Reported-by: syzbot+e3b775f40babeff6e68b(a)syzkaller.appspotmail.com
Suggested-by: Dmitry Vyukov <dvyukov(a)google.com>
Acked-by: Oliver Hartkopp <socketcan(a)hartkopp.net>
Cc: linux-stable <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
net/can/af_can.c | 18 +++++++-----------
1 file changed, 7 insertions(+), 11 deletions(-)
diff --git a/net/can/af_can.c b/net/can/af_can.c
index ae835382e678..4d7f988a3130 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -738,20 +738,16 @@ static int canfd_rcv(struct sk_buff *skb, struct net_device *dev,
{
struct canfd_frame *cfd = (struct canfd_frame *)skb->data;
- if (WARN_ONCE(dev->type != ARPHRD_CAN ||
- skb->len != CANFD_MTU ||
- cfd->len > CANFD_MAX_DLEN,
- "PF_CAN: dropped non conform CAN FD skbuf: "
- "dev type %d, len %d, datalen %d\n",
- dev->type, skb->len, cfd->len))
- goto drop;
+ if (unlikely(dev->type != ARPHRD_CAN || skb->len != CANFD_MTU ||
+ cfd->len > CANFD_MAX_DLEN)) {
+ pr_warn_once("PF_CAN: dropped non conform CAN FD skbuf: dev type %d, len %d, datalen %d\n",
+ dev->type, skb->len, cfd->len);
+ kfree_skb(skb);
+ return NET_RX_DROP;
+ }
can_receive(skb, dev);
return NET_RX_SUCCESS;
-
-drop:
- kfree_skb(skb);
- return NET_RX_DROP;
}
/*
--
2.15.1
If an invalid CAN frame is received, from a driver or from a tun
interface, a kernel warning is generated.
This patch replaces the WARN_ONCE() with a simple pr_warn_once(), so
that a kernel booted with panic_on_warn does not panic. A printk seems
more appropriate here.
Reported-by: syzbot+4386709c0c1284dca827(a)syzkaller.appspotmail.com
Suggested-by: Dmitry Vyukov <dvyukov(a)google.com>
Acked-by: Oliver Hartkopp <socketcan(a)hartkopp.net>
Cc: linux-stable <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
net/can/af_can.c | 18 +++++++-----------
1 file changed, 7 insertions(+), 11 deletions(-)
diff --git a/net/can/af_can.c b/net/can/af_can.c
index 003b2d6d655f..ae835382e678 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -721,20 +721,16 @@ static int can_rcv(struct sk_buff *skb, struct net_device *dev,
{
struct canfd_frame *cfd = (struct canfd_frame *)skb->data;
- if (WARN_ONCE(dev->type != ARPHRD_CAN ||
- skb->len != CAN_MTU ||
- cfd->len > CAN_MAX_DLEN,
- "PF_CAN: dropped non conform CAN skbuf: "
- "dev type %d, len %d, datalen %d\n",
- dev->type, skb->len, cfd->len))
- goto drop;
+ if (unlikely(dev->type != ARPHRD_CAN || skb->len != CAN_MTU ||
+ cfd->len > CAN_MAX_DLEN)) {
+ pr_warn_once("PF_CAN: dropped non conform CAN skbuf: dev type %d, len %d, datalen %d\n",
+ dev->type, skb->len, cfd->len);
+ kfree_skb(skb);
+ return NET_RX_DROP;
+ }
can_receive(skb, dev);
return NET_RX_SUCCESS;
-
-drop:
- kfree_skb(skb);
- return NET_RX_DROP;
}
static int canfd_rcv(struct sk_buff *skb, struct net_device *dev,
--
2.15.1
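As a quick cross-check of the conditions both hunks test, here is a minimal,
hypothetical userspace sketch (not part of the patches) that prints the size
constants from <linux/can.h> and applies the same length checks the receive
paths use; the device-type check (ARPHRD_CAN) is omitted since it has no
userspace equivalent.

#include <linux/can.h>
#include <stdio.h>

/*
 * Mirror of the checks in can_rcv()/canfd_rcv(): a classic CAN skb must be
 * exactly CAN_MTU bytes with cfd->len <= CAN_MAX_DLEN, a CAN FD skb exactly
 * CANFD_MTU bytes with cfd->len <= CANFD_MAX_DLEN.
 */
static int frame_conforms(size_t skb_len, unsigned char data_len, int fd)
{
        if (fd)
                return skb_len == CANFD_MTU && data_len <= CANFD_MAX_DLEN;
        return skb_len == CAN_MTU && data_len <= CAN_MAX_DLEN;
}

int main(void)
{
        printf("CAN_MTU=%zu CAN_MAX_DLEN=%d\n", CAN_MTU, CAN_MAX_DLEN);
        printf("CANFD_MTU=%zu CANFD_MAX_DLEN=%d\n", CANFD_MTU, CANFD_MAX_DLEN);
        /* e.g. a CANFD_MTU-sized buffer claiming 65 bytes of payload is non-conform */
        printf("conform: %d\n", frame_conforms(CANFD_MTU, 65, 1));
        return 0;
}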
This is a note to let you know that I've just added the patch titled
scsi: sg: disable SET_FORCE_LOW_DMA
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
scsi-sg-disable-set_force_low_dma.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 745dfa0d8ec26b24f3304459ff6e9eacc5c8351b Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare(a)suse.de>
Date: Fri, 7 Apr 2017 09:34:12 +0200
Subject: scsi: sg: disable SET_FORCE_LOW_DMA
From: Hannes Reinecke <hare(a)suse.de>
commit 745dfa0d8ec26b24f3304459ff6e9eacc5c8351b upstream.
The ioctl SET_FORCE_LOW_DMA has never worked since the initial git
check-in, and the respective setting is nowadays handled correctly. So
disable it entirely.
Signed-off-by: Hannes Reinecke <hare(a)suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn(a)suse.de>
Tested-by: Johannes Thumshirn <jthumshirn(a)suse.de>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/scsi/sg.c | 30 +++++++++---------------------
include/scsi/sg.h | 1 -
2 files changed, 9 insertions(+), 22 deletions(-)
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -149,7 +149,6 @@ typedef struct sg_fd { /* holds the sta
struct list_head rq_list; /* head of request list */
struct fasync_struct *async_qp; /* used by asynchronous notification */
Sg_request req_arr[SG_MAX_QUEUE]; /* used as singly-linked list */
- char low_dma; /* as in parent but possibly overridden to 1 */
char force_packid; /* 1 -> pack_id input to read(), 0 -> ignored */
char cmd_q; /* 1 -> allow command queuing, 0 -> don't */
unsigned char next_cmd_len; /* 0: automatic, >0: use on next write() */
@@ -922,24 +921,14 @@ sg_ioctl(struct file *filp, unsigned int
/* strange ..., for backward compatibility */
return sfp->timeout_user;
case SG_SET_FORCE_LOW_DMA:
- result = get_user(val, ip);
- if (result)
- return result;
- if (val) {
- sfp->low_dma = 1;
- if ((0 == sfp->low_dma) && !sfp->res_in_use) {
- val = (int) sfp->reserve.bufflen;
- sg_remove_scat(sfp, &sfp->reserve);
- sg_build_reserve(sfp, val);
- }
- } else {
- if (atomic_read(&sdp->detaching))
- return -ENODEV;
- sfp->low_dma = sdp->device->host->unchecked_isa_dma;
- }
+ /*
+ * N.B. This ioctl never worked properly, but failed to
+ * return an error value. So returning '0' to keep compability
+ * with legacy applications.
+ */
return 0;
case SG_GET_LOW_DMA:
- return put_user((int) sfp->low_dma, ip);
+ return put_user((int) sdp->device->host->unchecked_isa_dma, ip);
case SG_GET_SCSI_ID:
if (!access_ok(VERIFY_WRITE, p, sizeof (sg_scsi_id_t)))
return -EFAULT;
@@ -1860,6 +1849,7 @@ sg_build_indirect(Sg_scatter_hold * schp
int sg_tablesize = sfp->parentdp->sg_tablesize;
int blk_size = buff_size, order;
gfp_t gfp_mask = GFP_ATOMIC | __GFP_COMP | __GFP_NOWARN;
+ struct sg_device *sdp = sfp->parentdp;
if (blk_size < 0)
return -EFAULT;
@@ -1885,7 +1875,7 @@ sg_build_indirect(Sg_scatter_hold * schp
scatter_elem_sz_prev = num;
}
- if (sfp->low_dma)
+ if (sdp->device->host->unchecked_isa_dma)
gfp_mask |= GFP_DMA;
if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
@@ -2148,8 +2138,6 @@ sg_add_sfp(Sg_device * sdp)
sfp->timeout = SG_DEFAULT_TIMEOUT;
sfp->timeout_user = SG_DEFAULT_TIMEOUT_USER;
sfp->force_packid = SG_DEF_FORCE_PACK_ID;
- sfp->low_dma = (SG_DEF_FORCE_LOW_DMA == 0) ?
- sdp->device->host->unchecked_isa_dma : 1;
sfp->cmd_q = SG_DEF_COMMAND_Q;
sfp->keep_orphan = SG_DEF_KEEP_ORPHAN;
sfp->parentdp = sdp;
@@ -2608,7 +2596,7 @@ static void sg_proc_debug_helper(struct
jiffies_to_msecs(fp->timeout),
fp->reserve.bufflen,
(int) fp->reserve.k_use_sg,
- (int) fp->low_dma);
+ (int) sdp->device->host->unchecked_isa_dma);
seq_printf(s, " cmd_q=%d f_packid=%d k_orphan=%d closed=0\n",
(int) fp->cmd_q, (int) fp->force_packid,
(int) fp->keep_orphan);
--- a/include/scsi/sg.h
+++ b/include/scsi/sg.h
@@ -197,7 +197,6 @@ typedef struct sg_req_info { /* used by
#define SG_DEFAULT_RETRIES 0
/* Defaults, commented if they differ from original sg driver */
-#define SG_DEF_FORCE_LOW_DMA 0 /* was 1 -> memory below 16MB on i386 */
#define SG_DEF_FORCE_PACK_ID 0
#define SG_DEF_KEEP_ORPHAN 0
#define SG_DEF_RESERVED_SIZE SG_SCATTER_SZ /* load time option */
Patches currently in stable-queue which might be from hare(a)suse.de are
queue-4.9/scsi-sg-disable-set_force_low_dma.patch
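For completeness, a small hypothetical userspace sketch (the /dev/sg0 path is
an assumption, error handling trimmed) showing the behaviour applications see
after this change: SG_SET_FORCE_LOW_DMA is accepted but ignored, and
SG_GET_LOW_DMA now reports the host adapter's unchecked_isa_dma flag.

#include <fcntl.h>
#include <scsi/sg.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
        int val = 1, low_dma = -1;
        int fd = open("/dev/sg0", O_RDWR);      /* assumed device node */

        if (fd < 0) {
                perror("open");
                return 1;
        }
        /* Accepted for compatibility, but has no effect after this patch. */
        if (ioctl(fd, SG_SET_FORCE_LOW_DMA, &val) != 0)
                perror("SG_SET_FORCE_LOW_DMA");
        /* Now reflects the host adapter's unchecked_isa_dma flag. */
        if (ioctl(fd, SG_GET_LOW_DMA, &low_dma) == 0)
                printf("low_dma = %d\n", low_dma);
        close(fd);
        return 0;
}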
This is a note to let you know that I've just added the patch titled
scsi: sg: disable SET_FORCE_LOW_DMA
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
scsi-sg-disable-set_force_low_dma.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 745dfa0d8ec26b24f3304459ff6e9eacc5c8351b Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare(a)suse.de>
Date: Fri, 7 Apr 2017 09:34:12 +0200
Subject: scsi: sg: disable SET_FORCE_LOW_DMA
From: Hannes Reinecke <hare(a)suse.de>
commit 745dfa0d8ec26b24f3304459ff6e9eacc5c8351b upstream.
The ioctl SET_FORCE_LOW_DMA has never worked since the initial git
check-in, and the respective setting is nowadays handled correctly. So
disable it entirely.
Signed-off-by: Hannes Reinecke <hare(a)suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn(a)suse.de>
Tested-by: Johannes Thumshirn <jthumshirn(a)suse.de>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/scsi/sg.c | 30 +++++++++---------------------
include/scsi/sg.h | 1 -
2 files changed, 9 insertions(+), 22 deletions(-)
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -160,7 +160,6 @@ typedef struct sg_fd { /* holds the sta
struct list_head rq_list; /* head of request list */
struct fasync_struct *async_qp; /* used by asynchronous notification */
Sg_request req_arr[SG_MAX_QUEUE]; /* used as singly-linked list */
- char low_dma; /* as in parent but possibly overridden to 1 */
char force_packid; /* 1 -> pack_id input to read(), 0 -> ignored */
char cmd_q; /* 1 -> allow command queuing, 0 -> don't */
unsigned char next_cmd_len; /* 0: automatic, >0: use on next write() */
@@ -932,24 +931,14 @@ sg_ioctl(struct file *filp, unsigned int
/* strange ..., for backward compatibility */
return sfp->timeout_user;
case SG_SET_FORCE_LOW_DMA:
- result = get_user(val, ip);
- if (result)
- return result;
- if (val) {
- sfp->low_dma = 1;
- if ((0 == sfp->low_dma) && !sfp->res_in_use) {
- val = (int) sfp->reserve.bufflen;
- sg_remove_scat(sfp, &sfp->reserve);
- sg_build_reserve(sfp, val);
- }
- } else {
- if (atomic_read(&sdp->detaching))
- return -ENODEV;
- sfp->low_dma = sdp->device->host->unchecked_isa_dma;
- }
+ /*
+ * N.B. This ioctl never worked properly, but failed to
+ * return an error value. So returning '0' to keep compability
+ * with legacy applications.
+ */
return 0;
case SG_GET_LOW_DMA:
- return put_user((int) sfp->low_dma, ip);
+ return put_user((int) sdp->device->host->unchecked_isa_dma, ip);
case SG_GET_SCSI_ID:
if (!access_ok(VERIFY_WRITE, p, sizeof (sg_scsi_id_t)))
return -EFAULT;
@@ -1870,6 +1859,7 @@ sg_build_indirect(Sg_scatter_hold * schp
int sg_tablesize = sfp->parentdp->sg_tablesize;
int blk_size = buff_size, order;
gfp_t gfp_mask = GFP_ATOMIC | __GFP_COMP | __GFP_NOWARN;
+ struct sg_device *sdp = sfp->parentdp;
if (blk_size < 0)
return -EFAULT;
@@ -1895,7 +1885,7 @@ sg_build_indirect(Sg_scatter_hold * schp
scatter_elem_sz_prev = num;
}
- if (sfp->low_dma)
+ if (sdp->device->host->unchecked_isa_dma)
gfp_mask |= GFP_DMA;
if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
@@ -2158,8 +2148,6 @@ sg_add_sfp(Sg_device * sdp)
sfp->timeout = SG_DEFAULT_TIMEOUT;
sfp->timeout_user = SG_DEFAULT_TIMEOUT_USER;
sfp->force_packid = SG_DEF_FORCE_PACK_ID;
- sfp->low_dma = (SG_DEF_FORCE_LOW_DMA == 0) ?
- sdp->device->host->unchecked_isa_dma : 1;
sfp->cmd_q = SG_DEF_COMMAND_Q;
sfp->keep_orphan = SG_DEF_KEEP_ORPHAN;
sfp->parentdp = sdp;
@@ -2618,7 +2606,7 @@ static void sg_proc_debug_helper(struct
jiffies_to_msecs(fp->timeout),
fp->reserve.bufflen,
(int) fp->reserve.k_use_sg,
- (int) fp->low_dma);
+ (int) sdp->device->host->unchecked_isa_dma);
seq_printf(s, " cmd_q=%d f_packid=%d k_orphan=%d closed=0\n",
(int) fp->cmd_q, (int) fp->force_packid,
(int) fp->keep_orphan);
--- a/include/scsi/sg.h
+++ b/include/scsi/sg.h
@@ -197,7 +197,6 @@ typedef struct sg_req_info { /* used by
#define SG_DEFAULT_RETRIES 0
/* Defaults, commented if they differ from original sg driver */
-#define SG_DEF_FORCE_LOW_DMA 0 /* was 1 -> memory below 16MB on i386 */
#define SG_DEF_FORCE_PACK_ID 0
#define SG_DEF_KEEP_ORPHAN 0
#define SG_DEF_RESERVED_SIZE SG_SCATTER_SZ /* load time option */
Patches currently in stable-queue which might be from hare(a)suse.de are
queue-4.4/scsi-sg-disable-set_force_low_dma.patch
This is a note to let you know that I've just added the patch titled
scsi: sg: disable SET_FORCE_LOW_DMA
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
scsi-sg-disable-set_force_low_dma.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 745dfa0d8ec26b24f3304459ff6e9eacc5c8351b Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare(a)suse.de>
Date: Fri, 7 Apr 2017 09:34:12 +0200
Subject: scsi: sg: disable SET_FORCE_LOW_DMA
From: Hannes Reinecke <hare(a)suse.de>
commit 745dfa0d8ec26b24f3304459ff6e9eacc5c8351b upstream.
The ioctl SET_FORCE_LOW_DMA has never worked since the initial git
check-in, and the respective setting is nowadays handled correctly. So
disable it entirely.
Signed-off-by: Hannes Reinecke <hare(a)suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn(a)suse.de>
Tested-by: Johannes Thumshirn <jthumshirn(a)suse.de>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/scsi/sg.c | 30 +++++++++---------------------
include/scsi/sg.h | 1 -
2 files changed, 9 insertions(+), 22 deletions(-)
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -160,7 +160,6 @@ typedef struct sg_fd { /* holds the sta
struct list_head rq_list; /* head of request list */
struct fasync_struct *async_qp; /* used by asynchronous notification */
Sg_request req_arr[SG_MAX_QUEUE]; /* used as singly-linked list */
- char low_dma; /* as in parent but possibly overridden to 1 */
char force_packid; /* 1 -> pack_id input to read(), 0 -> ignored */
char cmd_q; /* 1 -> allow command queuing, 0 -> don't */
unsigned char next_cmd_len; /* 0: automatic, >0: use on next write() */
@@ -947,24 +946,14 @@ sg_ioctl(struct file *filp, unsigned int
/* strange ..., for backward compatibility */
return sfp->timeout_user;
case SG_SET_FORCE_LOW_DMA:
- result = get_user(val, ip);
- if (result)
- return result;
- if (val) {
- sfp->low_dma = 1;
- if ((0 == sfp->low_dma) && !sfp->res_in_use) {
- val = (int) sfp->reserve.bufflen;
- sg_remove_scat(sfp, &sfp->reserve);
- sg_build_reserve(sfp, val);
- }
- } else {
- if (atomic_read(&sdp->detaching))
- return -ENODEV;
- sfp->low_dma = sdp->device->host->unchecked_isa_dma;
- }
+ /*
+ * N.B. This ioctl never worked properly, but failed to
+ * return an error value. So returning '0' to keep compability
+ * with legacy applications.
+ */
return 0;
case SG_GET_LOW_DMA:
- return put_user((int) sfp->low_dma, ip);
+ return put_user((int) sdp->device->host->unchecked_isa_dma, ip);
case SG_GET_SCSI_ID:
if (!access_ok(VERIFY_WRITE, p, sizeof (sg_scsi_id_t)))
return -EFAULT;
@@ -1916,6 +1905,7 @@ sg_build_indirect(Sg_scatter_hold * schp
int sg_tablesize = sfp->parentdp->sg_tablesize;
int blk_size = buff_size, order;
gfp_t gfp_mask = GFP_ATOMIC | __GFP_COMP | __GFP_NOWARN;
+ struct sg_device *sdp = sfp->parentdp;
if (blk_size < 0)
return -EFAULT;
@@ -1941,7 +1931,7 @@ sg_build_indirect(Sg_scatter_hold * schp
scatter_elem_sz_prev = num;
}
- if (sfp->low_dma)
+ if (sdp->device->host->unchecked_isa_dma)
gfp_mask |= GFP_DMA;
if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
@@ -2204,8 +2194,6 @@ sg_add_sfp(Sg_device * sdp)
sfp->timeout = SG_DEFAULT_TIMEOUT;
sfp->timeout_user = SG_DEFAULT_TIMEOUT_USER;
sfp->force_packid = SG_DEF_FORCE_PACK_ID;
- sfp->low_dma = (SG_DEF_FORCE_LOW_DMA == 0) ?
- sdp->device->host->unchecked_isa_dma : 1;
sfp->cmd_q = SG_DEF_COMMAND_Q;
sfp->keep_orphan = SG_DEF_KEEP_ORPHAN;
sfp->parentdp = sdp;
@@ -2664,7 +2652,7 @@ static void sg_proc_debug_helper(struct
jiffies_to_msecs(fp->timeout),
fp->reserve.bufflen,
(int) fp->reserve.k_use_sg,
- (int) fp->low_dma);
+ (int) sdp->device->host->unchecked_isa_dma);
seq_printf(s, " cmd_q=%d f_packid=%d k_orphan=%d closed=0\n",
(int) fp->cmd_q, (int) fp->force_packid,
(int) fp->keep_orphan);
--- a/include/scsi/sg.h
+++ b/include/scsi/sg.h
@@ -194,7 +194,6 @@ typedef struct sg_req_info { /* used by
#define SG_DEFAULT_RETRIES 0
/* Defaults, commented if they differ from original sg driver */
-#define SG_DEF_FORCE_LOW_DMA 0 /* was 1 -> memory below 16MB on i386 */
#define SG_DEF_FORCE_PACK_ID 0
#define SG_DEF_KEEP_ORPHAN 0
#define SG_DEF_RESERVED_SIZE SG_SCATTER_SZ /* load time option */
Patches currently in stable-queue which might be from hare(a)suse.de are
queue-3.18/scsi-sg-disable-set_force_low_dma.patch
This is a note to let you know that I've just added the patch titled
tools/objtool/Makefile: don't assume sync-check.sh is executable
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tools-objtool-makefile-don-t-assume-sync-check.sh-is-executable.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 0f908ccbeca99ddf0ad60afa710e72aded4a5ea7 Mon Sep 17 00:00:00 2001
From: Andrew Morton <akpm(a)linux-foundation.org>
Date: Fri, 12 Jan 2018 16:53:17 -0800
Subject: tools/objtool/Makefile: don't assume sync-check.sh is executable
From: Andrew Morton <akpm(a)linux-foundation.org>
commit 0f908ccbeca99ddf0ad60afa710e72aded4a5ea7 upstream.
patch(1) loses the x bit. So if a user follows our patching
instructions in Documentation/admin-guide/README.rst, their kernel will
not compile.
Fixes: 3bd51c5a371de ("objtool: Move kernel headers/code sync check to a script")
Reported-by: Nicolas Bock <nicolasbock(a)gentoo.org>
Reported-by: Joakim Tjernlund <Joakim.Tjernlund(a)infinera.com>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Holger Hoffstätte <holger(a)applied-asynchrony.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/objtool/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/objtool/Makefile
+++ b/tools/objtool/Makefile
@@ -46,7 +46,7 @@ $(OBJTOOL_IN): fixdep FORCE
@$(MAKE) $(build)=objtool
$(OBJTOOL): $(LIBSUBCMD) $(OBJTOOL_IN)
- @./sync-check.sh
+ @$(CONFIG_SHELL) ./sync-check.sh
$(QUIET_LINK)$(CC) $(OBJTOOL_IN) $(LDFLAGS) -o $@
Patches currently in stable-queue which might be from akpm(a)linux-foundation.org are
queue-4.14/tools-objtool-makefile-don-t-assume-sync-check.sh-is-executable.patch
On 01/16/2018 08:32 AM, Shankar, Ravi V wrote:
> Vikas on vacation until end of the month. Fenghua will look into this
> issue.
>
> On Jan 16, 2018, at 5:09 AM, Thomas Gleixner <tglx(a)linutronix.de
> <mailto:tglx@linutronix.de>> wrote:
>
>>
>> Vikas, Fenghua can you please look at that ASAP?
>>
>> On Sun, 14 Jan 2018, Thomas Gleixner wrote:
>>
>>> On Fri, 12 Jan 2018, Joseph Salisbury wrote:
>>>
>>>> Hi Vikas,
>>>>
>>>> A kernel bug report was opened against Ubuntu [0]. After a kernel
>>>> bisect, it was found that reverting the following commit resolved
>>>> this bug:
>>>>
>>>> commit 24247aeeabe99eab13b798ccccc2dec066dd6f07
>>>> Author: Vikas Shivappa <vikas.shivappa(a)linux.intel.com
>>>> <mailto:vikas.shivappa@linux.intel.com>>
>>>> Date: Tue Aug 15 18:00:43 2017 -0700
>>>>
>>>> x86/intel_rdt/cqm: Improve limbo list processing
>>>>
>>>>
>>>> The regression was introduced as of v4.14-r1 and still exists with
>>>> current mainline. The trace with v4.15-rc7 is in comment #44[1].
>>>>
>>>> I was hoping to get your feedback, since you are the patch author. Do
>>>> you think gathering any additional data will help diagnose this issue,
>>>> or would it be best to submit a revert request?
>>>
>>> That stinks like a use after free. Can you run with KASAN enabled?
>>>
>>> Thanks,
>>>
>>> tglx
Here is some data with KASAN enabled:
https://bugs.launchpad.net/ubuntu/+source/linux-hwe/+bug/1733662/comments/51
Are there any specific logs you would like to see, or specific actions
executed?
Thanks,
Joe
At this point UBI volumes have already been free()'ed and fastmap can no
longer access these data structures.
Reported-by: Martin Townsend <mtownsend1973(a)gmail.com>
Fixes: 74cdaf24004a ("UBI: Fastmap: Fix memory leaks while closing the WL sub-system")
Cc: stable(a)vger.kernel.org
Signed-off-by: Richard Weinberger <richard(a)nod.at>
---
drivers/mtd/ubi/fastmap-wl.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/mtd/ubi/fastmap-wl.c b/drivers/mtd/ubi/fastmap-wl.c
index 4f0bd6b4422a..69dd21679a30 100644
--- a/drivers/mtd/ubi/fastmap-wl.c
+++ b/drivers/mtd/ubi/fastmap-wl.c
@@ -362,7 +362,6 @@ static void ubi_fastmap_close(struct ubi_device *ubi)
{
int i;
- flush_work(&ubi->fm_work);
return_unused_pool_pebs(ubi, &ubi->fm_pool);
return_unused_pool_pebs(ubi, &ubi->fm_wl_pool);
--
2.13.6
On Wed, Jan 17, 2018 at 12:03:15PM -0800, Eric Anholt wrote:
> Boris Brezillon <boris.brezillon(a)free-electrons.com> writes:
>
> > When saving BOs in the hang state we skip one entry of the
> > kernel_state->bo[] array, thus leaving it to NULL. This leads to a NULL
> > pointer dereference when, later in this function, we iterate over all
> > BOs to check their ->madv state.
> >
> > Fixes: ca26d28bbaa3 ("drm/vc4: improve throughput by pipelining binning and rendering jobs")
> > Cc: <stable(a)vger.kernel.org>
> > Signed-off-by: Boris Brezillon <boris.brezillon(a)free-electrons.com>
> > ---
> > drivers/gpu/drm/vc4/vc4_gem.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
> > index 6c32c89a83a9..19ac7fe0e5db 100644
> > --- a/drivers/gpu/drm/vc4/vc4_gem.c
> > +++ b/drivers/gpu/drm/vc4/vc4_gem.c
> > @@ -208,7 +208,7 @@ vc4_save_hang_state(struct drm_device *dev)
> > kernel_state->bo[j + prev_idx] = &bo->base.base;
> > j++;
> > }
> > - prev_idx = j + 1;
> > + prev_idx = j;
>
> Could we replace the whole "[j + prev_idx]" with a "[k++]" and maybe a
> WARN_ON_ONCE(k != state->bo_count) at the end?
>
> I really need to figure out if I can come up with a way to make IGT
> cases for GPU hangs on vc4, despite the validation. I found a bug in
> GPU reset due to BCL hangs when doing vc5, but I don't have a testcase.
> Maybe some submit flags that overwrite the BCL or RCL to do an infinite
> loop?
What we currently do for i915 is an endless chain of batches (since there is
no command parser, we can get away with that). Previously we did a special
debugfs mode which blocked out updating the ring head (but left all the
other command submission handling in place). Except for that very minor
change nothing needed to be adjusted in the kernel, and from the kernel's
pov it very much looked like the GPU simply died.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On Wed, 17 Jan 2018 12:03:15 -0800
Eric Anholt <eric(a)anholt.net> wrote:
> Boris Brezillon <boris.brezillon(a)free-electrons.com> writes:
>
> > When saving BOs in the hang state we skip one entry of the
> > kernel_state->bo[] array, thus leaving it to NULL. This leads to a NULL
> > pointer dereference when, later in this function, we iterate over all
> > BOs to check their ->madv state.
> >
> > Fixes: ca26d28bbaa3 ("drm/vc4: improve throughput by pipelining binning and rendering jobs")
> > Cc: <stable(a)vger.kernel.org>
> > Signed-off-by: Boris Brezillon <boris.brezillon(a)free-electrons.com>
> > ---
> > drivers/gpu/drm/vc4/vc4_gem.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
> > index 6c32c89a83a9..19ac7fe0e5db 100644
> > --- a/drivers/gpu/drm/vc4/vc4_gem.c
> > +++ b/drivers/gpu/drm/vc4/vc4_gem.c
> > @@ -208,7 +208,7 @@ vc4_save_hang_state(struct drm_device *dev)
> > kernel_state->bo[j + prev_idx] = &bo->base.base;
> > j++;
> > }
> > - prev_idx = j + 1;
> > + prev_idx = j;
>
> Could we replace the whole "[j + prev_idx]" with a "[k++]" and maybe a
> WARN_ON_ONCE(k != state->bo_count) at the end?
Sure.
>
> I really need to figure out if I can come up with a way to make IGT
> cases for GPU hangs on vc4, despite the validation.
I managed to trigger the NULL pointer dereference while debugging the
perfmon stuff, but it's fixed now, so I don't have a way to easily
force a reset.
> I found a bug in
> GPU reset due to BCL hangs when doing vc5, but I don't have a testcase.
> Maybe some submit flags that overwrite the BCL or RCL to do an infinite
> loop?
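To make the off-by-one concrete, here is a small standalone simulation (not
the driver code) of the index bookkeeping in vc4_save_hang_state(): advancing
with prev_idx = j + 1 leaves one NULL slot between the two groups of BOs,
which the later ->madv walk then dereferences, while prev_idx = j (or Eric's
single running index) packs the array contiguously. The group sizes are made
up for illustration.

#include <stdio.h>

#define GROUPS 2
#define PER_GROUP 3
#define TOTAL (GROUPS * PER_GROUP)

/* Simulate filling kernel_state->bo[] from two BO lists. */
static void fill(const char *bo[], int buggy)
{
        int prev_idx = 0;

        for (int g = 0; g < GROUPS; g++) {
                int j = 0;

                for (int i = 0; i < PER_GROUP; i++) {
                        bo[j + prev_idx] = "bo";
                        j++;
                }
                /* The bug: advancing by j + 1 skips one slot per group. */
                prev_idx = buggy ? j + 1 : j;
        }
}

int main(void)
{
        const char *bo[TOTAL + 1] = { 0 };      /* +1 so the buggy write stays in bounds */

        fill(bo, 1);
        for (int i = 0; i < TOTAL; i++)
                if (!bo[i])
                        printf("buggy: slot %d left NULL -> later ->madv walk hits NULL\n", i);
        return 0;
}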
usbip host lists devices attached to vhci_hcd on the same server
when the user attaches over localhost or specifies the server itself
as the remote.
usbip attach -r localhost -b busid
or
usbip attach -r servername (or server IP)
Fix it to check and not list devices that are attached to vhci_hcd.
Cc: stable(a)vger.kernel.org
Signed-off-by: Shuah Khan <shuahkh(a)osg.samsung.com>
---
tools/usb/usbip/src/usbip_list.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/tools/usb/usbip/src/usbip_list.c b/tools/usb/usbip/src/usbip_list.c
index f1b38e866dd7..d65a9f444174 100644
--- a/tools/usb/usbip/src/usbip_list.c
+++ b/tools/usb/usbip/src/usbip_list.c
@@ -187,6 +187,7 @@ static int list_devices(bool parsable)
const char *busid;
char product_name[128];
int ret = -1;
+ const char *devpath;
/* Create libudev context. */
udev = udev_new();
@@ -209,6 +210,14 @@ static int list_devices(bool parsable)
path = udev_list_entry_get_name(dev_list_entry);
dev = udev_device_new_from_syspath(udev, path);
+ /* Ignore devices attached to vhci_hcd */
+ devpath = udev_device_get_devpath(dev);
+ if (strstr(devpath, USBIP_VHCI_DRV_NAME)) {
+ dbg("Skip the device %s already attached to %s\n",
+ devpath, USBIP_VHCI_DRV_NAME);
+ continue;
+ }
+
/* Get device information. */
idVendor = udev_device_get_sysattr_value(dev, "idVendor");
idProduct = udev_device_get_sysattr_value(dev, "idProduct");
--
2.14.1
usbip host binds to devices attached to vhci_hcd on the same server
when the user attaches over localhost or specifies the server itself
as the remote.
usbip attach -r localhost -b busid
or
usbip attach -r servername (or server IP)
Unbind followed by bind works; however, the device is left in a bad state:
accesses via the attached busid result in errors, and the system hangs
during shutdown.
Fix it to check and bail out if the device is already attached to vhci_hcd.
Cc: stable(a)vger.kernel.org
Signed-off-by: Shuah Khan <shuahkh(a)osg.samsung.com>
---
tools/usb/usbip/src/usbip_bind.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/tools/usb/usbip/src/usbip_bind.c b/tools/usb/usbip/src/usbip_bind.c
index fa46141ae68b..e121cfb1746a 100644
--- a/tools/usb/usbip/src/usbip_bind.c
+++ b/tools/usb/usbip/src/usbip_bind.c
@@ -144,6 +144,7 @@ static int bind_device(char *busid)
int rc;
struct udev *udev;
struct udev_device *dev;
+ const char *devpath;
/* Check whether the device with this bus ID exists. */
udev = udev_new();
@@ -152,8 +153,16 @@ static int bind_device(char *busid)
err("device with the specified bus ID does not exist");
return -1;
}
+ devpath = udev_device_get_devpath(dev);
udev_unref(udev);
+ /* If the device is already attached to vhci_hcd - bail out */
+ if (strstr(devpath, USBIP_VHCI_DRV_NAME)) {
+ err("bind loop detected: device: %s is attached to %s\n",
+ devpath, USBIP_VHCI_DRV_NAME);
+ return -1;
+ }
+
rc = unbind_other(busid);
if (rc == UNBIND_ST_FAILED) {
err("could not unbind driver from device on busid %s", busid);
--
2.14.1
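Both fixes hinge on the same test: a device whose sysfs devpath contains the
vhci_hcd driver name is already an imported/attached device. A standalone
sketch of that check using libudev (the "vhci_hcd" literal stands in for
USBIP_VHCI_DRV_NAME, and the "usb" subsystem match is an assumption; error
handling trimmed):

#include <libudev.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
        struct udev *udev = udev_new();
        struct udev_enumerate *enumerate = udev_enumerate_new(udev);
        struct udev_list_entry *devices, *entry;

        udev_enumerate_add_match_subsystem(enumerate, "usb");
        udev_enumerate_scan_devices(enumerate);
        devices = udev_enumerate_get_list_entry(enumerate);

        udev_list_entry_foreach(entry, devices) {
                const char *path = udev_list_entry_get_name(entry);
                struct udev_device *dev = udev_device_new_from_syspath(udev, path);
                const char *devpath = udev_device_get_devpath(dev);

                /* Same test as the patches: skip/refuse devices under vhci_hcd. */
                if (devpath && strstr(devpath, "vhci_hcd"))
                        printf("already attached via vhci_hcd: %s\n", devpath);

                udev_device_unref(dev);
        }
        udev_enumerate_unref(enumerate);
        udev_unref(udev);
        return 0;
}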
If ubifs_wbuf_sync() fails, we must not write a master node with the
dirty marker cleared.
Otherwise it is possible that, in case of an I/O error while syncing, we
mark the filesystem as clean and UBIFS refuses to recover upon the next
mount.
Cc: <stable(a)vger.kernel.org>
Fixes: 1e51764a3c2a ("UBIFS: add new flash file system")
Signed-off-by: Richard Weinberger <richard(a)nod.at>
---
fs/ubifs/super.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index 0beb285b143d..468bca452d0a 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -1739,8 +1739,11 @@ static void ubifs_remount_ro(struct ubifs_info *c)
dbg_save_space_info(c);
- for (i = 0; i < c->jhead_cnt; i++)
- ubifs_wbuf_sync(&c->jheads[i].wbuf);
+ for (i = 0; i < c->jhead_cnt; i++) {
+ err = ubifs_wbuf_sync(&c->jheads[i].wbuf);
+ if (err)
+ ubifs_ro_mode(c, err);
+ }
c->mst_node->flags &= ~cpu_to_le32(UBIFS_MST_DIRTY);
c->mst_node->flags |= cpu_to_le32(UBIFS_MST_NO_ORPHS);
@@ -1806,8 +1809,11 @@ static void ubifs_put_super(struct super_block *sb)
int err;
/* Synchronize write-buffers */
- for (i = 0; i < c->jhead_cnt; i++)
- ubifs_wbuf_sync(&c->jheads[i].wbuf);
+ for (i = 0; i < c->jhead_cnt; i++) {
+ err = ubifs_wbuf_sync(&c->jheads[i].wbuf);
+ if (err)
+ ubifs_ro_mode(c, err);
+ }
/*
* We are being cleanly unmounted which means the
--
2.13.6
Hi Greg,
Could you please pick up commit 1b5c7ef3d0d0 ("drm/nouveau/disp/gf119:
add missing drive vfunc ptr") for the 4.14 series? It fixes
https://bugs.freedesktop.org/show_bug.cgi?id=103421 which seems to break
nouveau for everyone with a GF119 card.
This problem has also been reported twice in Debian[1,2], and the Debian
kernel team has already applied the patch.
Cheers,
Sven
1. https://bugs.debian.org/880660
2. https://bugs.debian.org/886727
This is a note to let you know that I've just added the patch titled
drm/nouveau/disp/gf119: add missing drive vfunc ptr
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
drm-nouveau-disp-gf119-add-missing-drive-vfunc-ptr.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 1b5c7ef3d0d0610bda9b63263f7c5b7178d11015 Mon Sep 17 00:00:00 2001
From: Rob Clark <robdclark(a)gmail.com>
Date: Sat, 6 Jan 2018 10:59:41 -0500
Subject: drm/nouveau/disp/gf119: add missing drive vfunc ptr
From: Rob Clark <robdclark(a)gmail.com>
commit 1b5c7ef3d0d0610bda9b63263f7c5b7178d11015 upstream.
Fixes broken dp on GF119:
Call Trace:
? nvkm_dp_train_drive+0x183/0x2c0 [nouveau]
nvkm_dp_acquire+0x4f3/0xcd0 [nouveau]
nv50_disp_super_2_2+0x5d/0x470 [nouveau]
? nvkm_devinit_pll_set+0xf/0x20 [nouveau]
gf119_disp_super+0x19c/0x2f0 [nouveau]
process_one_work+0x193/0x3c0
worker_thread+0x35/0x3b0
kthread+0x125/0x140
? process_one_work+0x3c0/0x3c0
? kthread_park+0x60/0x60
ret_from_fork+0x25/0x30
Code: Bad RIP value.
RIP: (null) RSP: ffffb1e243e4bc38
CR2: 0000000000000000
Fixes: af85389c614a ("drm/nouveau/disp: shuffle functions around")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103421
Signed-off-by: Rob Clark <robdclark(a)gmail.com>
Signed-off-by: Ben Skeggs <bskeggs(a)redhat.com>
Cc: Sven Joachim <svenjoac(a)gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/nouveau/nvkm/engine/disp/sorgf119.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/sorgf119.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/sorgf119.c
@@ -174,6 +174,7 @@ gf119_sor = {
.links = gf119_sor_dp_links,
.power = g94_sor_dp_power,
.pattern = gf119_sor_dp_pattern,
+ .drive = gf119_sor_dp_drive,
.vcpi = gf119_sor_dp_vcpi,
.audio = gf119_sor_dp_audio,
.audio_sym = gf119_sor_dp_audio_sym,
Patches currently in stable-queue which might be from robdclark(a)gmail.com are
queue-4.14/drm-nouveau-disp-gf119-add-missing-drive-vfunc-ptr.patch
Patch 3 fixes the userspace segfaults caused by the PVCLOCK_FIXMAP user
mapping (which I've copied from the 3.2 kaiser patchset). I don't claim to
fully understand this, so the fix might be too broad.
Andrea Arcangeli (1):
x86/mm/kaiser: remove paravirt clock warning
Juerg Haefliger (3):
Revert "x86: kvmclock: Disable use from vDSO if KPTI is enabled"
x86/kaiser: Add PVCLOCK_FIXMAP user mapping
x86/kaiser: Fix segfaults caused by the PVCLOCK_FIXMAP user mapping
Marcelo Tosatti (1):
kvmclock: export kvmclock clocksource and data pointers
arch/x86/include/asm/kvmclock.h | 6 ++++++
arch/x86/kernel/kvmclock.c | 9 +++------
arch/x86/mm/kaiser.c | 12 +++++++++++-
3 files changed, 20 insertions(+), 7 deletions(-)
create mode 100644 arch/x86/include/asm/kvmclock.h
--
2.14.1
This is a note to let you know that I've just added the patch titled
libnvdimm, btt: Fix an incompatibility in the log layout
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
libnvdimm-btt-fix-an-incompatibility-in-the-log-layout.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 24e3a7fb60a9187e5df90e5fa655ffc94b9c4f77 Mon Sep 17 00:00:00 2001
From: Vishal Verma <vishal.l.verma(a)intel.com>
Date: Mon, 18 Dec 2017 09:28:39 -0700
Subject: libnvdimm, btt: Fix an incompatibility in the log layout
From: Vishal Verma <vishal.l.verma(a)intel.com>
commit 24e3a7fb60a9187e5df90e5fa655ffc94b9c4f77 upstream.
Due to a spec misinterpretation, the Linux implementation of the BTT log
area had a different padding scheme from other implementations, such as
UEFI and NVML.
This fixes the padding scheme, and defaults to it for new BTT layouts.
We attempt to detect the padding scheme in use when probing for an
existing BTT. If we detect the older/incompatible scheme, we continue
using it.
Reported-by: Juston Li <juston.li(a)intel.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Fixes: 5212e11fde4d ("nd_btt: atomic sector updates")
Signed-off-by: Vishal Verma <vishal.l.verma(a)intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/nvdimm/btt.c | 203 ++++++++++++++++++++++++++++++++++++++++++---------
drivers/nvdimm/btt.h | 45 +++++++++++
2 files changed, 212 insertions(+), 36 deletions(-)
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -183,13 +183,13 @@ static int btt_map_read(struct arena_inf
return ret;
}
-static int btt_log_read_pair(struct arena_info *arena, u32 lane,
- struct log_entry *ent)
+static int btt_log_group_read(struct arena_info *arena, u32 lane,
+ struct log_group *log)
{
- WARN_ON(!ent);
+ WARN_ON(!log);
return arena_read_bytes(arena,
- arena->logoff + (2 * lane * LOG_ENT_SIZE), ent,
- 2 * LOG_ENT_SIZE);
+ arena->logoff + (lane * LOG_GRP_SIZE), log,
+ LOG_GRP_SIZE);
}
static struct dentry *debugfs_root;
@@ -229,6 +229,8 @@ static void arena_debugfs_init(struct ar
debugfs_create_x64("logoff", S_IRUGO, d, &a->logoff);
debugfs_create_x64("info2off", S_IRUGO, d, &a->info2off);
debugfs_create_x32("flags", S_IRUGO, d, &a->flags);
+ debugfs_create_u32("log_index_0", S_IRUGO, d, &a->log_index[0]);
+ debugfs_create_u32("log_index_1", S_IRUGO, d, &a->log_index[1]);
}
static void btt_debugfs_init(struct btt *btt)
@@ -247,6 +249,11 @@ static void btt_debugfs_init(struct btt
}
}
+static u32 log_seq(struct log_group *log, int log_idx)
+{
+ return le32_to_cpu(log->ent[log_idx].seq);
+}
+
/*
* This function accepts two log entries, and uses the
* sequence number to find the 'older' entry.
@@ -256,8 +263,10 @@ static void btt_debugfs_init(struct btt
*
* TODO The logic feels a bit kludge-y. make it better..
*/
-static int btt_log_get_old(struct log_entry *ent)
+static int btt_log_get_old(struct arena_info *a, struct log_group *log)
{
+ int idx0 = a->log_index[0];
+ int idx1 = a->log_index[1];
int old;
/*
@@ -265,23 +274,23 @@ static int btt_log_get_old(struct log_en
* the next time, the following logic works out to put this
* (next) entry into [1]
*/
- if (ent[0].seq == 0) {
- ent[0].seq = cpu_to_le32(1);
+ if (log_seq(log, idx0) == 0) {
+ log->ent[idx0].seq = cpu_to_le32(1);
return 0;
}
- if (ent[0].seq == ent[1].seq)
+ if (log_seq(log, idx0) == log_seq(log, idx1))
return -EINVAL;
- if (le32_to_cpu(ent[0].seq) + le32_to_cpu(ent[1].seq) > 5)
+ if (log_seq(log, idx0) + log_seq(log, idx1) > 5)
return -EINVAL;
- if (le32_to_cpu(ent[0].seq) < le32_to_cpu(ent[1].seq)) {
- if (le32_to_cpu(ent[1].seq) - le32_to_cpu(ent[0].seq) == 1)
+ if (log_seq(log, idx0) < log_seq(log, idx1)) {
+ if ((log_seq(log, idx1) - log_seq(log, idx0)) == 1)
old = 0;
else
old = 1;
} else {
- if (le32_to_cpu(ent[0].seq) - le32_to_cpu(ent[1].seq) == 1)
+ if ((log_seq(log, idx0) - log_seq(log, idx1)) == 1)
old = 1;
else
old = 0;
@@ -306,17 +315,18 @@ static int btt_log_read(struct arena_inf
{
int ret;
int old_ent, ret_ent;
- struct log_entry log[2];
+ struct log_group log;
- ret = btt_log_read_pair(arena, lane, log);
+ ret = btt_log_group_read(arena, lane, &log);
if (ret)
return -EIO;
- old_ent = btt_log_get_old(log);
+ old_ent = btt_log_get_old(arena, &log);
if (old_ent < 0 || old_ent > 1) {
dev_info(to_dev(arena),
"log corruption (%d): lane %d seq [%d, %d]\n",
- old_ent, lane, log[0].seq, log[1].seq);
+ old_ent, lane, log.ent[arena->log_index[0]].seq,
+ log.ent[arena->log_index[1]].seq);
/* TODO set error state? */
return -EIO;
}
@@ -324,7 +334,7 @@ static int btt_log_read(struct arena_inf
ret_ent = (old_flag ? old_ent : (1 - old_ent));
if (ent != NULL)
- memcpy(ent, &log[ret_ent], LOG_ENT_SIZE);
+ memcpy(ent, &log.ent[arena->log_index[ret_ent]], LOG_ENT_SIZE);
return ret_ent;
}
@@ -338,17 +348,13 @@ static int __btt_log_write(struct arena_
u32 sub, struct log_entry *ent)
{
int ret;
- /*
- * Ignore the padding in log_entry for calculating log_half.
- * The entry is 'committed' when we write the sequence number,
- * and we want to ensure that that is the last thing written.
- * We don't bother writing the padding as that would be extra
- * media wear and write amplification
- */
- unsigned int log_half = (LOG_ENT_SIZE - 2 * sizeof(u64)) / 2;
- u64 ns_off = arena->logoff + (((2 * lane) + sub) * LOG_ENT_SIZE);
+ u32 group_slot = arena->log_index[sub];
+ unsigned int log_half = LOG_ENT_SIZE / 2;
void *src = ent;
+ u64 ns_off;
+ ns_off = arena->logoff + (lane * LOG_GRP_SIZE) +
+ (group_slot * LOG_ENT_SIZE);
/* split the 16B write into atomic, durable halves */
ret = arena_write_bytes(arena, ns_off, src, log_half);
if (ret)
@@ -419,16 +425,16 @@ static int btt_log_init(struct arena_inf
{
int ret;
u32 i;
- struct log_entry log, zerolog;
+ struct log_entry ent, zerolog;
memset(&zerolog, 0, sizeof(zerolog));
for (i = 0; i < arena->nfree; i++) {
- log.lba = cpu_to_le32(i);
- log.old_map = cpu_to_le32(arena->external_nlba + i);
- log.new_map = cpu_to_le32(arena->external_nlba + i);
- log.seq = cpu_to_le32(LOG_SEQ_INIT);
- ret = __btt_log_write(arena, i, 0, &log);
+ ent.lba = cpu_to_le32(i);
+ ent.old_map = cpu_to_le32(arena->external_nlba + i);
+ ent.new_map = cpu_to_le32(arena->external_nlba + i);
+ ent.seq = cpu_to_le32(LOG_SEQ_INIT);
+ ret = __btt_log_write(arena, i, 0, &ent);
if (ret)
return ret;
ret = __btt_log_write(arena, i, 1, &zerolog);
@@ -490,6 +496,123 @@ static int btt_freelist_init(struct aren
return 0;
}
+static bool ent_is_padding(struct log_entry *ent)
+{
+ return (ent->lba == 0) && (ent->old_map == 0) && (ent->new_map == 0)
+ && (ent->seq == 0);
+}
+
+/*
+ * Detecting valid log indices: We read a log group (see the comments in btt.h
+ * for a description of a 'log_group' and its 'slots'), and iterate over its
+ * four slots. We expect that a padding slot will be all-zeroes, and use this
+ * to detect a padding slot vs. an actual entry.
+ *
+ * If a log_group is in the initial state, i.e. hasn't been used since the
+ * creation of this BTT layout, it will have three of the four slots with
+ * zeroes. We skip over these log_groups for the detection of log_index. If
+ * all log_groups are in the initial state (i.e. the BTT has never been
+ * written to), it is safe to assume the 'new format' of log entries in slots
+ * (0, 1).
+ */
+static int log_set_indices(struct arena_info *arena)
+{
+ bool idx_set = false, initial_state = true;
+ int ret, log_index[2] = {-1, -1};
+ u32 i, j, next_idx = 0;
+ struct log_group log;
+ u32 pad_count = 0;
+
+ for (i = 0; i < arena->nfree; i++) {
+ ret = btt_log_group_read(arena, i, &log);
+ if (ret < 0)
+ return ret;
+
+ for (j = 0; j < 4; j++) {
+ if (!idx_set) {
+ if (ent_is_padding(&log.ent[j])) {
+ pad_count++;
+ continue;
+ } else {
+ /* Skip if index has been recorded */
+ if ((next_idx == 1) &&
+ (j == log_index[0]))
+ continue;
+ /* valid entry, record index */
+ log_index[next_idx] = j;
+ next_idx++;
+ }
+ if (next_idx == 2) {
+ /* two valid entries found */
+ idx_set = true;
+ } else if (next_idx > 2) {
+ /* too many valid indices */
+ return -ENXIO;
+ }
+ } else {
+ /*
+ * once the indices have been set, just verify
+ * that all subsequent log groups are either in
+ * their initial state or follow the same
+ * indices.
+ */
+ if (j == log_index[0]) {
+ /* entry must be 'valid' */
+ if (ent_is_padding(&log.ent[j]))
+ return -ENXIO;
+ } else if (j == log_index[1]) {
+ ;
+ /*
+ * log_index[1] can be padding if the
+ * lane never got used and it is still
+ * in the initial state (three 'padding'
+ * entries)
+ */
+ } else {
+ /* entry must be invalid (padding) */
+ if (!ent_is_padding(&log.ent[j]))
+ return -ENXIO;
+ }
+ }
+ }
+ /*
+ * If any of the log_groups have more than one valid,
+ * non-padding entry, then the we are no longer in the
+ * initial_state
+ */
+ if (pad_count < 3)
+ initial_state = false;
+ pad_count = 0;
+ }
+
+ if (!initial_state && !idx_set)
+ return -ENXIO;
+
+ /*
+ * If all the entries in the log were in the initial state,
+ * assume new padding scheme
+ */
+ if (initial_state)
+ log_index[1] = 1;
+
+ /*
+ * Only allow the known permutations of log/padding indices,
+ * i.e. (0, 1), and (0, 2)
+ */
+ if ((log_index[0] == 0) && ((log_index[1] == 1) || (log_index[1] == 2)))
+ ; /* known index possibilities */
+ else {
+ dev_err(to_dev(arena), "Found an unknown padding scheme\n");
+ return -ENXIO;
+ }
+
+ arena->log_index[0] = log_index[0];
+ arena->log_index[1] = log_index[1];
+ dev_dbg(to_dev(arena), "log_index_0 = %d\n", log_index[0]);
+ dev_dbg(to_dev(arena), "log_index_1 = %d\n", log_index[1]);
+ return 0;
+}
+
static int btt_rtt_init(struct arena_info *arena)
{
arena->rtt = kcalloc(arena->nfree, sizeof(u32), GFP_KERNEL);
@@ -545,8 +668,7 @@ static struct arena_info *alloc_arena(st
available -= 2 * BTT_PG_SIZE;
/* The log takes a fixed amount of space based on nfree */
- logsize = roundup(2 * arena->nfree * sizeof(struct log_entry),
- BTT_PG_SIZE);
+ logsize = roundup(arena->nfree * LOG_GRP_SIZE, BTT_PG_SIZE);
available -= logsize;
/* Calculate optimal split between map and data area */
@@ -563,6 +685,10 @@ static struct arena_info *alloc_arena(st
arena->mapoff = arena->dataoff + datasize;
arena->logoff = arena->mapoff + mapsize;
arena->info2off = arena->logoff + logsize;
+
+ /* Default log indices are (0,1) */
+ arena->log_index[0] = 0;
+ arena->log_index[1] = 1;
return arena;
}
@@ -653,6 +779,13 @@ static int discover_arenas(struct btt *b
arena->external_lba_start = cur_nlba;
parse_arena_meta(arena, super, cur_off);
+ ret = log_set_indices(arena);
+ if (ret) {
+ dev_err(to_dev(arena),
+ "Unable to deduce log/padding indices\n");
+ goto out;
+ }
+
ret = btt_freelist_init(arena);
if (ret)
goto out;
--- a/drivers/nvdimm/btt.h
+++ b/drivers/nvdimm/btt.h
@@ -26,6 +26,7 @@
#define MAP_ERR_MASK (1 << MAP_ERR_SHIFT)
#define MAP_LBA_MASK (~((1 << MAP_TRIM_SHIFT) | (1 << MAP_ERR_SHIFT)))
#define MAP_ENT_NORMAL 0xC0000000
+#define LOG_GRP_SIZE sizeof(struct log_group)
#define LOG_ENT_SIZE sizeof(struct log_entry)
#define ARENA_MIN_SIZE (1UL << 24) /* 16 MB */
#define ARENA_MAX_SIZE (1ULL << 39) /* 512 GB */
@@ -44,12 +45,52 @@ enum btt_init_state {
INIT_READY
};
+/*
+ * A log group represents one log 'lane', and consists of four log entries.
+ * Two of the four entries are valid entries, and the remaining two are
+ * padding. Due to an old bug in the padding location, we need to perform a
+ * test to determine the padding scheme being used, and use that scheme
+ * thereafter.
+ *
+ * In kernels prior to 4.15, 'log group' would have actual log entries at
+ * indices (0, 2) and padding at indices (1, 3), where as the correct/updated
+ * format has log entries at indices (0, 1) and padding at indices (2, 3).
+ *
+ * Old (pre 4.15) format:
+ * +-----------------+-----------------+
+ * | ent[0] | ent[1] |
+ * | 16B | 16B |
+ * | lba/old/new/seq | pad |
+ * +-----------------------------------+
+ * | ent[2] | ent[3] |
+ * | 16B | 16B |
+ * | lba/old/new/seq | pad |
+ * +-----------------+-----------------+
+ *
+ * New format:
+ * +-----------------+-----------------+
+ * | ent[0] | ent[1] |
+ * | 16B | 16B |
+ * | lba/old/new/seq | lba/old/new/seq |
+ * +-----------------------------------+
+ * | ent[2] | ent[3] |
+ * | 16B | 16B |
+ * | pad | pad |
+ * +-----------------+-----------------+
+ *
+ * We detect during start-up which format is in use, and set
+ * arena->log_index[(0, 1)] with the detected format.
+ */
+
struct log_entry {
__le32 lba;
__le32 old_map;
__le32 new_map;
__le32 seq;
- __le64 padding[2];
+};
+
+struct log_group {
+ struct log_entry ent[4];
};
struct btt_sb {
@@ -117,6 +158,7 @@ struct aligned_lock {
* @list: List head for list of arenas
* @debugfs_dir: Debugfs dentry
* @flags: Arena flags - may signify error states.
+ * @log_index: Indices of the valid log entries in a log_group
*
* arena_info is a per-arena handle. Once an arena is narrowed down for an
* IO, this struct is passed around for the duration of the IO.
@@ -147,6 +189,7 @@ struct arena_info {
struct dentry *debugfs_dir;
/* Arena flags */
u32 flags;
+ int log_index[2];
};
/**
Patches currently in stable-queue which might be from vishal.l.verma(a)intel.com are
queue-4.9/libnvdimm-btt-fix-an-incompatibility-in-the-log-layout.patch
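As a sanity check on the arithmetic above, a small standalone sketch
(assuming the sizes implied by the patch: the old log_entry was 32 bytes on
media including its padding, the new log_group is 64 bytes made of four
16-byte slots) showing that the old per-entry offsets coincide with the new
lane * LOG_GRP_SIZE + log_index[sub] * LOG_ENT_SIZE scheme exactly when the
detected indices are (0, 2):

#include <stdint.h>
#include <stdio.h>

#define OLD_LOG_ENT_SIZE 32U    /* 16B payload + 16B padding (pre-fix layout) */
#define LOG_ENT_SIZE     16U    /* sizeof(struct log_entry) after the fix */
#define LOG_GRP_SIZE     64U    /* four 16-byte slots per lane */

/* Offsets relative to arena->logoff for log slot 'sub' of a lane. */
static uint64_t old_off(uint32_t lane, uint32_t sub)
{
        return ((2 * lane) + sub) * (uint64_t)OLD_LOG_ENT_SIZE;
}

static uint64_t new_off(uint32_t lane, uint32_t sub, const int log_index[2])
{
        return lane * (uint64_t)LOG_GRP_SIZE + log_index[sub] * (uint64_t)LOG_ENT_SIZE;
}

int main(void)
{
        const int new_fmt[2] = { 0, 1 };        /* new layout: entries in slots 0 and 1 */
        const int old_fmt[2] = { 0, 2 };        /* detected pre-4.15 layout: slots 0 and 2 */
        uint32_t lane, sub;

        for (lane = 0; lane < 3; lane++)
                for (sub = 0; sub < 2; sub++)
                        printf("lane %u sub %u: old=%llu new(0,1)=%llu new(0,2)=%llu\n",
                               lane, sub,
                               (unsigned long long)old_off(lane, sub),
                               (unsigned long long)new_off(lane, sub, new_fmt),
                               (unsigned long long)new_off(lane, sub, old_fmt));
        return 0;
}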
From: Rabin Vincent <rabinv(a)axis.com>
softirq time accounting is broken on v4.9.x if ksoftirqd runs.
With
CONFIG_IRQ_TIME_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
this test code:
struct tasklet_struct tasklet;

static void delay_tasklet(unsigned long data)
{
        udelay(10);
        tasklet_schedule(&tasklet);
}

tasklet_init(&tasklet, delay_tasklet, 0);
tasklet_schedule(&tasklet);
results in:
$ while :; do grep cpu0 /proc/stat; done
cpu0 5 0 80 25 16 107 1 0 0 0
cpu0 5 0 80 25 16 107 0 0 0 0
cpu0 5 0 80 25 16 107 0 0 0 0
cpu0 5 0 80 25 16 107 0 0 0 0
cpu0 5 0 81 25 16 107 0 0 0 0
cpu0 5 0 81 25 16 107 1 0 0 0
cpu0 5 0 81 25 16 108 18446744073709551615 0 0 0
cpu0 5 0 81 25 16 108 18446744073709551615 0 0 0
cpu0 5 0 81 25 16 108 18446744073709551615 0 0 0
cpu0 5 0 81 25 16 108 0 0 0 0
cpu0 6 0 81 25 16 108 0 0 0 0
cpu0 6 0 81 25 16 108 0 0 0 0
As can be seen, the softirq numbers are totally bogus.
When ksoftirqd is running, irqtime_account_process_tick() increments
cpustat[CPUTIME_SOFTIRQ]. This causes the "nsecs_to_cputime64(irqtime)
- cpustat[CPUTIME_SOFTIRQ]" calculation in irqtime_account_update() to
underflow the next time a softirq is handled, leading to the above
values.
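For reference, the 18446744073709551615 values above are simply 64-bit
wrap-around, i.e. 2^64 - 1. A minimal standalone illustration of the same
subtraction going negative, using the counter values from the /proc/stat
excerpt (not the scheduler code itself):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
        /*
         * cpustat[CPUTIME_SOFTIRQ] was bumped by the ksoftirqd path as well,
         * so it can run ahead of the irqtime counter it is subtracted from.
         */
        uint64_t irqtime_in_cputime = 107;      /* nsecs_to_cputime64(irqtime) */
        uint64_t cpustat_softirq    = 108;      /* already includes ksoftirqd tick */

        uint64_t delta = irqtime_in_cputime - cpustat_softirq;

        printf("delta = %" PRIu64 "\n", delta); /* 18446744073709551615 */
        return 0;
}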
The underflow bug was added by 57430218317e5b280 ("sched/cputime: Count
actually elapsed irq & softirq time").
But ksoftirqd accounting was wrong even in earlier kernels. In earlier
kernels, after a ksoftirqd run, the kernel would simply stop accounting
softirq time spent outside of ksoftirqd until that (accumulated) time
exceeded the time for which ksoftirqd had previously run.
Fix both the underflow and the wrong accounting by using a counter
specifically for the non-ksoftirqd softirq case.
This code has been fixed in current mainline by a499a5a14db
("sched/cputime: Increment kcpustat directly on irqtime account") [note
also the followup 25e2d8c1b9e327e ("sched/cputime: Fix ksoftirqd cputime
accounting regression")], but that patch is part of the many changes
for eliminating cputime_t, so it does not seem suitable for backport.
Signed-off-by: Rabin Vincent <rabinv(a)axis.com>
---
include/linux/kernel_stat.h | 1 +
kernel/sched/cputime.c | 9 ++++++++-
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
index 44fda64..d0826f1 100644
--- a/include/linux/kernel_stat.h
+++ b/include/linux/kernel_stat.h
@@ -33,6 +33,7 @@ enum cpu_usage_stat {
struct kernel_cpustat {
u64 cpustat[NR_STATS];
+ u64 softirq_no_ksoftirqd;
};
struct kernel_stat {
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 5ebee31..1b5a9e6 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -73,12 +73,19 @@ EXPORT_SYMBOL_GPL(irqtime_account_irq);
static cputime_t irqtime_account_update(u64 irqtime, int idx, cputime_t maxtime)
{
u64 *cpustat = kcpustat_this_cpu->cpustat;
+ u64 base = cpustat[idx];
cputime_t irq_cputime;
- irq_cputime = nsecs_to_cputime64(irqtime) - cpustat[idx];
+ if (idx == CPUTIME_SOFTIRQ)
+ base = kcpustat_this_cpu->softirq_no_ksoftirqd;
+
+ irq_cputime = nsecs_to_cputime64(irqtime) - base;
irq_cputime = min(irq_cputime, maxtime);
cpustat[idx] += irq_cputime;
+ if (idx == CPUTIME_SOFTIRQ)
+ kcpustat_this_cpu->softirq_no_ksoftirqd += irq_cputime;
+
return irq_cputime;
}
--
2.1.4
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 24e3a7fb60a9187e5df90e5fa655ffc94b9c4f77 Mon Sep 17 00:00:00 2001
From: Vishal Verma <vishal.l.verma(a)intel.com>
Date: Mon, 18 Dec 2017 09:28:39 -0700
Subject: [PATCH] libnvdimm, btt: Fix an incompatibility in the log layout
Due to a spec misinterpretation, the Linux implementation of the BTT log
area had a different padding scheme from other implementations, such as
UEFI and NVML.
This fixes the padding scheme, and defaults to it for new BTT layouts.
We attempt to detect the padding scheme in use when probing for an
existing BTT. If we detect the older/incompatible scheme, we continue
using it.
Reported-by: Juston Li <juston.li(a)intel.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Fixes: 5212e11fde4d ("nd_btt: atomic sector updates")
Signed-off-by: Vishal Verma <vishal.l.verma(a)intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index e949e3302af4..c586bcdb5190 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -211,12 +211,12 @@ static int btt_map_read(struct arena_info *arena, u32 lba, u32 *mapping,
return ret;
}
-static int btt_log_read_pair(struct arena_info *arena, u32 lane,
- struct log_entry *ent)
+static int btt_log_group_read(struct arena_info *arena, u32 lane,
+ struct log_group *log)
{
return arena_read_bytes(arena,
- arena->logoff + (2 * lane * LOG_ENT_SIZE), ent,
- 2 * LOG_ENT_SIZE, 0);
+ arena->logoff + (lane * LOG_GRP_SIZE), log,
+ LOG_GRP_SIZE, 0);
}
static struct dentry *debugfs_root;
@@ -256,6 +256,8 @@ static void arena_debugfs_init(struct arena_info *a, struct dentry *parent,
debugfs_create_x64("logoff", S_IRUGO, d, &a->logoff);
debugfs_create_x64("info2off", S_IRUGO, d, &a->info2off);
debugfs_create_x32("flags", S_IRUGO, d, &a->flags);
+ debugfs_create_u32("log_index_0", S_IRUGO, d, &a->log_index[0]);
+ debugfs_create_u32("log_index_1", S_IRUGO, d, &a->log_index[1]);
}
static void btt_debugfs_init(struct btt *btt)
@@ -274,6 +276,11 @@ static void btt_debugfs_init(struct btt *btt)
}
}
+static u32 log_seq(struct log_group *log, int log_idx)
+{
+ return le32_to_cpu(log->ent[log_idx].seq);
+}
+
/*
* This function accepts two log entries, and uses the
* sequence number to find the 'older' entry.
@@ -283,8 +290,10 @@ static void btt_debugfs_init(struct btt *btt)
*
* TODO The logic feels a bit kludge-y. make it better..
*/
-static int btt_log_get_old(struct log_entry *ent)
+static int btt_log_get_old(struct arena_info *a, struct log_group *log)
{
+ int idx0 = a->log_index[0];
+ int idx1 = a->log_index[1];
int old;
/*
@@ -292,23 +301,23 @@ static int btt_log_get_old(struct log_entry *ent)
* the next time, the following logic works out to put this
* (next) entry into [1]
*/
- if (ent[0].seq == 0) {
- ent[0].seq = cpu_to_le32(1);
+ if (log_seq(log, idx0) == 0) {
+ log->ent[idx0].seq = cpu_to_le32(1);
return 0;
}
- if (ent[0].seq == ent[1].seq)
+ if (log_seq(log, idx0) == log_seq(log, idx1))
return -EINVAL;
- if (le32_to_cpu(ent[0].seq) + le32_to_cpu(ent[1].seq) > 5)
+ if (log_seq(log, idx0) + log_seq(log, idx1) > 5)
return -EINVAL;
- if (le32_to_cpu(ent[0].seq) < le32_to_cpu(ent[1].seq)) {
- if (le32_to_cpu(ent[1].seq) - le32_to_cpu(ent[0].seq) == 1)
+ if (log_seq(log, idx0) < log_seq(log, idx1)) {
+ if ((log_seq(log, idx1) - log_seq(log, idx0)) == 1)
old = 0;
else
old = 1;
} else {
- if (le32_to_cpu(ent[0].seq) - le32_to_cpu(ent[1].seq) == 1)
+ if ((log_seq(log, idx0) - log_seq(log, idx1)) == 1)
old = 1;
else
old = 0;
@@ -328,17 +337,18 @@ static int btt_log_read(struct arena_info *arena, u32 lane,
{
int ret;
int old_ent, ret_ent;
- struct log_entry log[2];
+ struct log_group log;
- ret = btt_log_read_pair(arena, lane, log);
+ ret = btt_log_group_read(arena, lane, &log);
if (ret)
return -EIO;
- old_ent = btt_log_get_old(log);
+ old_ent = btt_log_get_old(arena, &log);
if (old_ent < 0 || old_ent > 1) {
dev_err(to_dev(arena),
"log corruption (%d): lane %d seq [%d, %d]\n",
- old_ent, lane, log[0].seq, log[1].seq);
+ old_ent, lane, log.ent[arena->log_index[0]].seq,
+ log.ent[arena->log_index[1]].seq);
/* TODO set error state? */
return -EIO;
}
@@ -346,7 +356,7 @@ static int btt_log_read(struct arena_info *arena, u32 lane,
ret_ent = (old_flag ? old_ent : (1 - old_ent));
if (ent != NULL)
- memcpy(ent, &log[ret_ent], LOG_ENT_SIZE);
+ memcpy(ent, &log.ent[arena->log_index[ret_ent]], LOG_ENT_SIZE);
return ret_ent;
}
@@ -360,17 +370,13 @@ static int __btt_log_write(struct arena_info *arena, u32 lane,
u32 sub, struct log_entry *ent, unsigned long flags)
{
int ret;
- /*
- * Ignore the padding in log_entry for calculating log_half.
- * The entry is 'committed' when we write the sequence number,
- * and we want to ensure that that is the last thing written.
- * We don't bother writing the padding as that would be extra
- * media wear and write amplification
- */
- unsigned int log_half = (LOG_ENT_SIZE - 2 * sizeof(u64)) / 2;
- u64 ns_off = arena->logoff + (((2 * lane) + sub) * LOG_ENT_SIZE);
+ u32 group_slot = arena->log_index[sub];
+ unsigned int log_half = LOG_ENT_SIZE / 2;
void *src = ent;
+ u64 ns_off;
+ ns_off = arena->logoff + (lane * LOG_GRP_SIZE) +
+ (group_slot * LOG_ENT_SIZE);
/* split the 16B write into atomic, durable halves */
ret = arena_write_bytes(arena, ns_off, src, log_half, flags);
if (ret)
@@ -453,7 +459,7 @@ static int btt_log_init(struct arena_info *arena)
{
size_t logsize = arena->info2off - arena->logoff;
size_t chunk_size = SZ_4K, offset = 0;
- struct log_entry log;
+ struct log_entry ent;
void *zerobuf;
int ret;
u32 i;
@@ -485,11 +491,11 @@ static int btt_log_init(struct arena_info *arena)
}
for (i = 0; i < arena->nfree; i++) {
- log.lba = cpu_to_le32(i);
- log.old_map = cpu_to_le32(arena->external_nlba + i);
- log.new_map = cpu_to_le32(arena->external_nlba + i);
- log.seq = cpu_to_le32(LOG_SEQ_INIT);
- ret = __btt_log_write(arena, i, 0, &log, 0);
+ ent.lba = cpu_to_le32(i);
+ ent.old_map = cpu_to_le32(arena->external_nlba + i);
+ ent.new_map = cpu_to_le32(arena->external_nlba + i);
+ ent.seq = cpu_to_le32(LOG_SEQ_INIT);
+ ret = __btt_log_write(arena, i, 0, &ent, 0);
if (ret)
goto free;
}
@@ -594,6 +600,123 @@ static int btt_freelist_init(struct arena_info *arena)
return 0;
}
+static bool ent_is_padding(struct log_entry *ent)
+{
+ return (ent->lba == 0) && (ent->old_map == 0) && (ent->new_map == 0)
+ && (ent->seq == 0);
+}
+
+/*
+ * Detecting valid log indices: We read a log group (see the comments in btt.h
+ * for a description of a 'log_group' and its 'slots'), and iterate over its
+ * four slots. We expect that a padding slot will be all-zeroes, and use this
+ * to detect a padding slot vs. an actual entry.
+ *
+ * If a log_group is in the initial state, i.e. hasn't been used since the
+ * creation of this BTT layout, it will have three of the four slots with
+ * zeroes. We skip over these log_groups for the detection of log_index. If
+ * all log_groups are in the initial state (i.e. the BTT has never been
+ * written to), it is safe to assume the 'new format' of log entries in slots
+ * (0, 1).
+ */
+static int log_set_indices(struct arena_info *arena)
+{
+ bool idx_set = false, initial_state = true;
+ int ret, log_index[2] = {-1, -1};
+ u32 i, j, next_idx = 0;
+ struct log_group log;
+ u32 pad_count = 0;
+
+ for (i = 0; i < arena->nfree; i++) {
+ ret = btt_log_group_read(arena, i, &log);
+ if (ret < 0)
+ return ret;
+
+ for (j = 0; j < 4; j++) {
+ if (!idx_set) {
+ if (ent_is_padding(&log.ent[j])) {
+ pad_count++;
+ continue;
+ } else {
+ /* Skip if index has been recorded */
+ if ((next_idx == 1) &&
+ (j == log_index[0]))
+ continue;
+ /* valid entry, record index */
+ log_index[next_idx] = j;
+ next_idx++;
+ }
+ if (next_idx == 2) {
+ /* two valid entries found */
+ idx_set = true;
+ } else if (next_idx > 2) {
+ /* too many valid indices */
+ return -ENXIO;
+ }
+ } else {
+ /*
+ * once the indices have been set, just verify
+ * that all subsequent log groups are either in
+ * their initial state or follow the same
+ * indices.
+ */
+ if (j == log_index[0]) {
+ /* entry must be 'valid' */
+ if (ent_is_padding(&log.ent[j]))
+ return -ENXIO;
+ } else if (j == log_index[1]) {
+ ;
+ /*
+ * log_index[1] can be padding if the
+ * lane never got used and it is still
+ * in the initial state (three 'padding'
+ * entries)
+ */
+ } else {
+ /* entry must be invalid (padding) */
+ if (!ent_is_padding(&log.ent[j]))
+ return -ENXIO;
+ }
+ }
+ }
+ /*
+ * If any of the log_groups have more than one valid,
+ * non-padding entry, then the we are no longer in the
+ * initial_state
+ */
+ if (pad_count < 3)
+ initial_state = false;
+ pad_count = 0;
+ }
+
+ if (!initial_state && !idx_set)
+ return -ENXIO;
+
+ /*
+ * If all the entries in the log were in the initial state,
+ * assume new padding scheme
+ */
+ if (initial_state)
+ log_index[1] = 1;
+
+ /*
+ * Only allow the known permutations of log/padding indices,
+ * i.e. (0, 1), and (0, 2)
+ */
+ if ((log_index[0] == 0) && ((log_index[1] == 1) || (log_index[1] == 2)))
+ ; /* known index possibilities */
+ else {
+ dev_err(to_dev(arena), "Found an unknown padding scheme\n");
+ return -ENXIO;
+ }
+
+ arena->log_index[0] = log_index[0];
+ arena->log_index[1] = log_index[1];
+ dev_dbg(to_dev(arena), "log_index_0 = %d\n", log_index[0]);
+ dev_dbg(to_dev(arena), "log_index_1 = %d\n", log_index[1]);
+ return 0;
+}
+
static int btt_rtt_init(struct arena_info *arena)
{
arena->rtt = kcalloc(arena->nfree, sizeof(u32), GFP_KERNEL);
@@ -650,8 +773,7 @@ static struct arena_info *alloc_arena(struct btt *btt, size_t size,
available -= 2 * BTT_PG_SIZE;
/* The log takes a fixed amount of space based on nfree */
- logsize = roundup(2 * arena->nfree * sizeof(struct log_entry),
- BTT_PG_SIZE);
+ logsize = roundup(arena->nfree * LOG_GRP_SIZE, BTT_PG_SIZE);
available -= logsize;
/* Calculate optimal split between map and data area */
@@ -668,6 +790,10 @@ static struct arena_info *alloc_arena(struct btt *btt, size_t size,
arena->mapoff = arena->dataoff + datasize;
arena->logoff = arena->mapoff + mapsize;
arena->info2off = arena->logoff + logsize;
+
+ /* Default log indices are (0,1) */
+ arena->log_index[0] = 0;
+ arena->log_index[1] = 1;
return arena;
}
@@ -758,6 +884,13 @@ static int discover_arenas(struct btt *btt)
arena->external_lba_start = cur_nlba;
parse_arena_meta(arena, super, cur_off);
+ ret = log_set_indices(arena);
+ if (ret) {
+ dev_err(to_dev(arena),
+ "Unable to deduce log/padding indices\n");
+ goto out;
+ }
+
mutex_init(&arena->err_lock);
ret = btt_freelist_init(arena);
if (ret)
diff --git a/drivers/nvdimm/btt.h b/drivers/nvdimm/btt.h
index 884fbbbdd18a..db3cb6d4d0d4 100644
--- a/drivers/nvdimm/btt.h
+++ b/drivers/nvdimm/btt.h
@@ -27,6 +27,7 @@
#define MAP_ERR_MASK (1 << MAP_ERR_SHIFT)
#define MAP_LBA_MASK (~((1 << MAP_TRIM_SHIFT) | (1 << MAP_ERR_SHIFT)))
#define MAP_ENT_NORMAL 0xC0000000
+#define LOG_GRP_SIZE sizeof(struct log_group)
#define LOG_ENT_SIZE sizeof(struct log_entry)
#define ARENA_MIN_SIZE (1UL << 24) /* 16 MB */
#define ARENA_MAX_SIZE (1ULL << 39) /* 512 GB */
@@ -50,12 +51,52 @@ enum btt_init_state {
INIT_READY
};
+/*
+ * A log group represents one log 'lane', and consists of four log entries.
+ * Two of the four entries are valid entries, and the remaining two are
+ * padding. Due to an old bug in the padding location, we need to perform a
+ * test to determine the padding scheme being used, and use that scheme
+ * thereafter.
+ *
+ * In kernels prior to 4.15, 'log group' would have actual log entries at
+ * indices (0, 2) and padding at indices (1, 3), where as the correct/updated
+ * format has log entries at indices (0, 1) and padding at indices (2, 3).
+ *
+ * Old (pre 4.15) format:
+ * +-----------------+-----------------+
+ * | ent[0] | ent[1] |
+ * | 16B | 16B |
+ * | lba/old/new/seq | pad |
+ * +-----------------------------------+
+ * | ent[2] | ent[3] |
+ * | 16B | 16B |
+ * | lba/old/new/seq | pad |
+ * +-----------------+-----------------+
+ *
+ * New format:
+ * +-----------------+-----------------+
+ * | ent[0] | ent[1] |
+ * | 16B | 16B |
+ * | lba/old/new/seq | lba/old/new/seq |
+ * +-----------------------------------+
+ * | ent[2] | ent[3] |
+ * | 16B | 16B |
+ * | pad | pad |
+ * +-----------------+-----------------+
+ *
+ * We detect during start-up which format is in use, and set
+ * arena->log_index[(0, 1)] with the detected format.
+ */
+
struct log_entry {
__le32 lba;
__le32 old_map;
__le32 new_map;
__le32 seq;
- __le64 padding[2];
+};
+
+struct log_group {
+ struct log_entry ent[4];
};
struct btt_sb {
@@ -126,6 +167,7 @@ struct aligned_lock {
* @debugfs_dir: Debugfs dentry
* @flags: Arena flags - may signify error states.
* @err_lock: Mutex for synchronizing error clearing.
+ * @log_index: Indices of the valid log entries in a log_group
*
* arena_info is a per-arena handle. Once an arena is narrowed down for an
* IO, this struct is passed around for the duration of the IO.
@@ -158,6 +200,7 @@ struct arena_info {
/* Arena flags */
u32 flags;
struct mutex err_lock;
+ int log_index[2];
};
/**
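The detection rule that log_set_indices() implements can be exercised outside the kernel. Below is a minimal user-space sketch (an illustration, not part of the patch: the struct layout mirrors btt.h, but the helper names are invented and the __le32 fields are treated as plain integers since they are only ever compared against zero) that classifies a single log group using the same all-zero padding test:

#include <stdbool.h>
#include <stdint.h>

struct log_entry {			/* mirrors the on-media entry in btt.h */
	uint32_t lba;
	uint32_t old_map;
	uint32_t new_map;
	uint32_t seq;
};

struct log_group {			/* one log 'lane': four 16-byte slots */
	struct log_entry ent[4];
};

static bool slot_is_padding(const struct log_entry *e)
{
	return !e->lba && !e->old_map && !e->new_map && !e->seq;
}

/*
 * Returns 1 for the new layout (valid entries at slots 0 and 1), 2 for the
 * old layout (valid entries at slots 0 and 2), 0 for a group still in its
 * initial state (only slot 0 used), and -1 for anything else.
 */
static int classify_log_group(const struct log_group *g)
{
	bool p1 = slot_is_padding(&g->ent[1]);
	bool p2 = slot_is_padding(&g->ent[2]);
	bool p3 = slot_is_padding(&g->ent[3]);

	if (slot_is_padding(&g->ent[0]) || !p3)
		return -1;	/* slot 0 must hold an entry, slot 3 never does */
	if (p1 && p2)
		return 0;	/* initial state: three padding slots */
	if (!p1 && p2)
		return 1;	/* new format: (0, 1) */
	if (p1 && !p2)
		return 2;	/* old format: (0, 2) */
	return -1;		/* entries at both 1 and 2: unknown scheme */
}

As in the patch, a BTT whose groups are all still in the initial state can safely be treated as new-format, since nothing has yet been logged with the old padding scheme.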
This is a note to let you know that I've just added the patch titled
x86/retpoline/xen: Convert Xen hypercall indirect jumps
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-retpoline-xen-convert-xen-hypercall-indirect-jumps.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ea08816d5b185ab3d09e95e393f265af54560350 Mon Sep 17 00:00:00 2001
From: David Woodhouse <dwmw(a)amazon.co.uk>
Date: Thu, 11 Jan 2018 21:46:31 +0000
Subject: x86/retpoline/xen: Convert Xen hypercall indirect jumps
From: David Woodhouse <dwmw(a)amazon.co.uk>
commit ea08816d5b185ab3d09e95e393f265af54560350 upstream.
Convert indirect call in Xen hypercall to use non-speculative sequence,
when CONFIG_RETPOLINE is enabled.
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Arjan van de Ven <arjan(a)linux.intel.com>
Acked-by: Ingo Molnar <mingo(a)kernel.org>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Cc: gnomes(a)lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: thomas.lendacky(a)amd.com
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Jiri Kosina <jikos(a)kernel.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Kees Cook <keescook(a)google.com>
Cc: Tim Chen <tim.c.chen(a)linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh(a)linux-foundation.org>
Cc: Paul Turner <pjt(a)google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-10-git-send-email-dwmw@amazon.co…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/xen/hypercall.h | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/arch/x86/include/asm/xen/hypercall.h
+++ b/arch/x86/include/asm/xen/hypercall.h
@@ -44,6 +44,7 @@
#include <asm/page.h>
#include <asm/pgtable.h>
#include <asm/smap.h>
+#include <asm/nospec-branch.h>
#include <xen/interface/xen.h>
#include <xen/interface/sched.h>
@@ -215,9 +216,9 @@ privcmd_call(unsigned call,
__HYPERCALL_5ARG(a1, a2, a3, a4, a5);
stac();
- asm volatile("call *%[call]"
+ asm volatile(CALL_NOSPEC
: __HYPERCALL_5PARAM
- : [call] "a" (&hypercall_page[call])
+ : [thunk_target] "a" (&hypercall_page[call])
: __HYPERCALL_CLOBBER5);
clac();
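The conversion itself is mechanical: the indirect call gains the fixed operand name [thunk_target] so that, when retpolines are compiled in, the C-side CALL_NOSPEC definition (added by the "Add initial retpoline support" patch later in this queue) can redirect it through __x86_indirect_thunk_<reg>; in that case THUNK_TARGET also tightens the constraint to a register. The user-space sketch below is an illustration, not kernel code: it shows only the no-retpoline fallback expansion, and it assumes the SysV x86-64 ABI, a build with -mno-red-zone (as the kernel uses), and a callback that leaves SSE state alone.

#define CALL_NOSPEC "call *%[thunk_target]\n"
#define THUNK_TARGET(addr) [thunk_target] "rm" (addr)

static int demo_nospec_call(int (*cb)(void))
{
	int ret;

	asm volatile(CALL_NOSPEC
		     : "=a" (ret)			/* callback's return value */
		     : THUNK_TARGET(cb)
		     /* caller-saved GPRs the callback may clobber */
		     : "rdi", "rsi", "rdx", "rcx", "r8", "r9", "r10", "r11",
		       "cc", "memory");
	return ret;
}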
Patches currently in stable-queue which might be from dwmw(a)amazon.co.uk are
queue-4.4/x86-spectre-add-boot-time-option-to-select-spectre-v2-mitigation.patch
queue-4.4/x86-retpoline-irq32-convert-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-hyperv-convert-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-entry-convert-entry-assembler-indirect-jumps.patch
queue-4.4/x86-cpu-amd-make-lfence-a-serializing-instruction.patch
queue-4.4/x86-retpoline-ftrace-convert-ftrace-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-crypto-convert-crypto-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-xen-convert-xen-hypercall-indirect-jumps.patch
queue-4.4/x86-retpoline-checksum32-convert-assembler-indirect-jumps.patch
queue-4.4/x86-mm-32-move-setup_clear_cpu_cap-x86_feature_pcid-earlier.patch
queue-4.4/x86-retpoline-fill-return-stack-buffer-on-vmexit.patch
queue-4.4/x86-retpoline-remove-compile-time-warning.patch
queue-4.4/x86-cpu-amd-use-lfence_rdtsc-in-preference-to-mfence_rdtsc.patch
queue-4.4/x86-retpoline-add-initial-retpoline-support.patch
This is a note to let you know that I've just added the patch titled
x86/retpoline: Remove compile time warning
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-retpoline-remove-compile-time-warning.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b8b9ce4b5aec8de9e23cabb0a26b78641f9ab1d6 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Sun, 14 Jan 2018 22:13:29 +0100
Subject: x86/retpoline: Remove compile time warning
From: Thomas Gleixner <tglx(a)linutronix.de>
commit b8b9ce4b5aec8de9e23cabb0a26b78641f9ab1d6 upstream.
Remove the compile time warning when CONFIG_RETPOLINE=y and the compiler
does not have retpoline support. Linus' rationale for this is:
It's wrong because it will just make people turn off RETPOLINE, and the
asm updates - and return stack clearing - that are independent of the
compiler are likely the most important parts because they are likely the
ones easiest to target.
And it's annoying because most people won't be able to do anything about
it. The number of people building their own compiler? Very small. So if
their distro hasn't got a compiler yet (and pretty much nobody does), the
warning is just annoying crap.
It is already properly reported as part of the sysfs interface. The
compile-time warning only encourages bad things.
Fixes: 76b043848fd2 ("x86/retpoline: Add initial retpoline support")
Requested-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: gnomes(a)lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: thomas.lendacky(a)amd.com
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Jiri Kosina <jikos(a)kernel.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Kees Cook <keescook(a)google.com>
Cc: Tim Chen <tim.c.chen(a)linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh(a)linux-foundation.org>
Link: https://lkml.kernel.org/r/CA+55aFzWgquv4i6Mab6bASqYXg3ErV3XDFEYf=GEcCDQg5uA…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/Makefile | 2 --
1 file changed, 2 deletions(-)
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -194,8 +194,6 @@ ifdef CONFIG_RETPOLINE
RETPOLINE_CFLAGS += $(call cc-option,-mindirect-branch=thunk-extern -mindirect-branch-register)
ifneq ($(RETPOLINE_CFLAGS),)
KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE
- else
- $(warning CONFIG_RETPOLINE=y, but not supported by the compiler. Toolchain update recommended.)
endif
endif
Patches currently in stable-queue which might be from tglx(a)linutronix.de are
queue-4.4/x86-spectre-add-boot-time-option-to-select-spectre-v2-mitigation.patch
queue-4.4/x86-retpoline-irq32-convert-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-hyperv-convert-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-entry-convert-entry-assembler-indirect-jumps.patch
queue-4.4/x86-asm-use-register-variable-to-get-stack-pointer-value.patch
queue-4.4/x86-cpu-amd-make-lfence-a-serializing-instruction.patch
queue-4.4/x86-retpoline-ftrace-convert-ftrace-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-crypto-convert-crypto-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-xen-convert-xen-hypercall-indirect-jumps.patch
queue-4.4/x86-retpoline-checksum32-convert-assembler-indirect-jumps.patch
queue-4.4/x86-mm-32-move-setup_clear_cpu_cap-x86_feature_pcid-earlier.patch
queue-4.4/x86-retpoline-fill-return-stack-buffer-on-vmexit.patch
queue-4.4/x86-retpoline-remove-compile-time-warning.patch
queue-4.4/x86-cpu-amd-use-lfence_rdtsc-in-preference-to-mfence_rdtsc.patch
queue-4.4/x86-retpoline-add-initial-retpoline-support.patch
queue-4.4/x86-asm-make-asm-alternative.h-safe-from-assembly.patch
This is a note to let you know that I've just added the patch titled
x86/retpoline/hyperv: Convert assembler indirect jumps
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-retpoline-hyperv-convert-assembler-indirect-jumps.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From e70e5892b28c18f517f29ab6e83bd57705104b31 Mon Sep 17 00:00:00 2001
From: David Woodhouse <dwmw(a)amazon.co.uk>
Date: Thu, 11 Jan 2018 21:46:30 +0000
Subject: x86/retpoline/hyperv: Convert assembler indirect jumps
From: David Woodhouse <dwmw(a)amazon.co.uk>
commit e70e5892b28c18f517f29ab6e83bd57705104b31 upstream.
Convert all indirect jumps in hyperv inline asm code to use non-speculative
sequences when CONFIG_RETPOLINE is enabled.
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Arjan van de Ven <arjan(a)linux.intel.com>
Acked-by: Ingo Molnar <mingo(a)kernel.org>
Cc: gnomes(a)lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: thomas.lendacky(a)amd.com
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Jiri Kosina <jikos(a)kernel.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Kees Cook <keescook(a)google.com>
Cc: Tim Chen <tim.c.chen(a)linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh(a)linux-foundation.org>
Cc: Paul Turner <pjt(a)google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-9-git-send-email-dwmw@amazon.co.…
[ backport to 4.4, hopefully correct, not tested... - gregkh ]
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/hv/hv.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -31,6 +31,7 @@
#include <linux/clockchips.h>
#include <asm/hyperv.h>
#include <asm/mshyperv.h>
+#include <asm/nospec-branch.h>
#include "hyperv_vmbus.h"
/* The one and only */
@@ -103,9 +104,10 @@ static u64 do_hypercall(u64 control, voi
return (u64)ULLONG_MAX;
__asm__ __volatile__("mov %0, %%r8" : : "r" (output_address) : "r8");
- __asm__ __volatile__("call *%3" : "=a" (hv_status) :
+ __asm__ __volatile__(CALL_NOSPEC :
+ "=a" (hv_status) :
"c" (control), "d" (input_address),
- "m" (hypercall_page));
+ THUNK_TARGET(hypercall_page));
return hv_status;
@@ -123,11 +125,12 @@ static u64 do_hypercall(u64 control, voi
if (!hypercall_page)
return (u64)ULLONG_MAX;
- __asm__ __volatile__ ("call *%8" : "=d"(hv_status_hi),
+ __asm__ __volatile__ (CALL_NOSPEC : "=d"(hv_status_hi),
"=a"(hv_status_lo) : "d" (control_hi),
"a" (control_lo), "b" (input_address_hi),
"c" (input_address_lo), "D"(output_address_hi),
- "S"(output_address_lo), "m" (hypercall_page));
+ "S"(output_address_lo),
+ THUNK_TARGET(hypercall_page));
return hv_status_lo | ((u64)hv_status_hi << 32);
#endif /* !x86_64 */
Patches currently in stable-queue which might be from dwmw(a)amazon.co.uk are
queue-4.4/x86-spectre-add-boot-time-option-to-select-spectre-v2-mitigation.patch
queue-4.4/x86-retpoline-irq32-convert-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-hyperv-convert-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-entry-convert-entry-assembler-indirect-jumps.patch
queue-4.4/x86-cpu-amd-make-lfence-a-serializing-instruction.patch
queue-4.4/x86-retpoline-ftrace-convert-ftrace-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-crypto-convert-crypto-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-xen-convert-xen-hypercall-indirect-jumps.patch
queue-4.4/x86-retpoline-checksum32-convert-assembler-indirect-jumps.patch
queue-4.4/x86-mm-32-move-setup_clear_cpu_cap-x86_feature_pcid-earlier.patch
queue-4.4/x86-retpoline-fill-return-stack-buffer-on-vmexit.patch
queue-4.4/x86-retpoline-remove-compile-time-warning.patch
queue-4.4/x86-cpu-amd-use-lfence_rdtsc-in-preference-to-mfence_rdtsc.patch
queue-4.4/x86-retpoline-add-initial-retpoline-support.patch
This is a note to let you know that I've just added the patch titled
x86/retpoline/ftrace: Convert ftrace assembler indirect jumps
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-retpoline-ftrace-convert-ftrace-assembler-indirect-jumps.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 9351803bd803cdbeb9b5a7850b7b6f464806e3db Mon Sep 17 00:00:00 2001
From: David Woodhouse <dwmw(a)amazon.co.uk>
Date: Thu, 11 Jan 2018 21:46:29 +0000
Subject: x86/retpoline/ftrace: Convert ftrace assembler indirect jumps
From: David Woodhouse <dwmw(a)amazon.co.uk>
commit 9351803bd803cdbeb9b5a7850b7b6f464806e3db upstream.
Convert all indirect jumps in ftrace assembler code to use non-speculative
sequences when CONFIG_RETPOLINE is enabled.
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Arjan van de Ven <arjan(a)linux.intel.com>
Acked-by: Ingo Molnar <mingo(a)kernel.org>
Cc: gnomes(a)lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: thomas.lendacky(a)amd.com
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Jiri Kosina <jikos(a)kernel.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Kees Cook <keescook(a)google.com>
Cc: Tim Chen <tim.c.chen(a)linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh(a)linux-foundation.org>
Cc: Paul Turner <pjt(a)google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-8-git-send-email-dwmw@amazon.co.…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Razvan Ghitulete <rga(a)amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/entry/entry_32.S | 5 +++--
arch/x86/kernel/mcount_64.S | 7 ++++---
2 files changed, 7 insertions(+), 5 deletions(-)
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -862,7 +862,8 @@ trace:
movl 0x4(%ebp), %edx
subl $MCOUNT_INSN_SIZE, %eax
- call *ftrace_trace_function
+ movl ftrace_trace_function, %ecx
+ CALL_NOSPEC %ecx
popl %edx
popl %ecx
@@ -897,7 +898,7 @@ return_to_handler:
movl %eax, %ecx
popl %edx
popl %eax
- jmp *%ecx
+ JMP_NOSPEC %ecx
#endif
#ifdef CONFIG_TRACING
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -7,7 +7,7 @@
#include <linux/linkage.h>
#include <asm/ptrace.h>
#include <asm/ftrace.h>
-
+#include <asm/nospec-branch.h>
.code64
.section .entry.text, "ax"
@@ -285,8 +285,9 @@ trace:
* ip and parent ip are used and the list function is called when
* function tracing is enabled.
*/
- call *ftrace_trace_function
+ movq ftrace_trace_function, %r8
+ CALL_NOSPEC %r8
restore_mcount_regs
jmp fgraph_trace
@@ -329,5 +330,5 @@ GLOBAL(return_to_handler)
movq 8(%rsp), %rdx
movq (%rsp), %rax
addq $24, %rsp
- jmp *%rdi
+ JMP_NOSPEC %rdi
#endif
Patches currently in stable-queue which might be from dwmw(a)amazon.co.uk are
queue-4.4/x86-spectre-add-boot-time-option-to-select-spectre-v2-mitigation.patch
queue-4.4/x86-retpoline-irq32-convert-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-hyperv-convert-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-entry-convert-entry-assembler-indirect-jumps.patch
queue-4.4/x86-cpu-amd-make-lfence-a-serializing-instruction.patch
queue-4.4/x86-retpoline-ftrace-convert-ftrace-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-crypto-convert-crypto-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-xen-convert-xen-hypercall-indirect-jumps.patch
queue-4.4/x86-retpoline-checksum32-convert-assembler-indirect-jumps.patch
queue-4.4/x86-mm-32-move-setup_clear_cpu_cap-x86_feature_pcid-earlier.patch
queue-4.4/x86-retpoline-fill-return-stack-buffer-on-vmexit.patch
queue-4.4/x86-retpoline-remove-compile-time-warning.patch
queue-4.4/x86-cpu-amd-use-lfence_rdtsc-in-preference-to-mfence_rdtsc.patch
queue-4.4/x86-retpoline-add-initial-retpoline-support.patch
This is a note to let you know that I've just added the patch titled
x86/retpoline: Fill return stack buffer on vmexit
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-retpoline-fill-return-stack-buffer-on-vmexit.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 117cc7a908c83697b0b737d15ae1eb5943afe35b Mon Sep 17 00:00:00 2001
From: David Woodhouse <dwmw(a)amazon.co.uk>
Date: Fri, 12 Jan 2018 11:11:27 +0000
Subject: x86/retpoline: Fill return stack buffer on vmexit
From: David Woodhouse <dwmw(a)amazon.co.uk>
commit 117cc7a908c83697b0b737d15ae1eb5943afe35b upstream.
In accordance with the Intel and AMD documentation, we need to overwrite
all entries in the RSB on exiting a guest, to prevent malicious branch
target predictions from affecting the host kernel. This is needed both
for retpoline and for IBRS.
[ak: numbers again for the RSB stuffing labels]
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: gnomes(a)lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: thomas.lendacky(a)amd.com
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Jiri Kosina <jikos(a)kernel.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Kees Cook <keescook(a)google.com>
Cc: Tim Chen <tim.c.chen(a)linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh(a)linux-foundation.org>
Cc: Paul Turner <pjt(a)google.com>
Link: https://lkml.kernel.org/r/1515755487-8524-1-git-send-email-dwmw@amazon.co.uk
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Razvan Ghitulete <rga(a)amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/nospec-branch.h | 76 ++++++++++++++++++++++++++++++++++-
arch/x86/kvm/svm.c | 4 +
arch/x86/kvm/vmx.c | 4 +
3 files changed, 83 insertions(+), 1 deletion(-)
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -7,6 +7,48 @@
#include <asm/alternative-asm.h>
#include <asm/cpufeature.h>
+/*
+ * Fill the CPU return stack buffer.
+ *
+ * Each entry in the RSB, if used for a speculative 'ret', contains an
+ * infinite 'pause; jmp' loop to capture speculative execution.
+ *
+ * This is required in various cases for retpoline and IBRS-based
+ * mitigations for the Spectre variant 2 vulnerability. Sometimes to
+ * eliminate potentially bogus entries from the RSB, and sometimes
+ * purely to ensure that it doesn't get empty, which on some CPUs would
+ * allow predictions from other (unwanted!) sources to be used.
+ *
+ * We define a CPP macro such that it can be used from both .S files and
+ * inline assembly. It's possible to do a .macro and then include that
+ * from C via asm(".include <asm/nospec-branch.h>") but let's not go there.
+ */
+
+#define RSB_CLEAR_LOOPS 32 /* To forcibly overwrite all entries */
+#define RSB_FILL_LOOPS 16 /* To avoid underflow */
+
+/*
+ * Google experimented with loop-unrolling and this turned out to be
+ * the optimal version — two calls, each with their own speculation
+ * trap should their return address end up getting used, in a loop.
+ */
+#define __FILL_RETURN_BUFFER(reg, nr, sp) \
+ mov $(nr/2), reg; \
+771: \
+ call 772f; \
+773: /* speculation trap */ \
+ pause; \
+ jmp 773b; \
+772: \
+ call 774f; \
+775: /* speculation trap */ \
+ pause; \
+ jmp 775b; \
+774: \
+ dec reg; \
+ jnz 771b; \
+ add $(BITS_PER_LONG/8) * nr, sp;
+
#ifdef __ASSEMBLY__
/*
@@ -61,6 +103,19 @@
#endif
.endm
+ /*
+ * A simpler FILL_RETURN_BUFFER macro. Don't make people use the CPP
+ * monstrosity above, manually.
+ */
+.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req
+#ifdef CONFIG_RETPOLINE
+ ALTERNATIVE "jmp .Lskip_rsb_\@", \
+ __stringify(__FILL_RETURN_BUFFER(\reg,\nr,%_ASM_SP)) \
+ \ftr
+.Lskip_rsb_\@:
+#endif
+.endm
+
#else /* __ASSEMBLY__ */
#if defined(CONFIG_X86_64) && defined(RETPOLINE)
@@ -97,7 +152,7 @@
X86_FEATURE_RETPOLINE)
# define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
-#else /* No retpoline */
+#else /* No retpoline for C / inline asm */
# define CALL_NOSPEC "call *%[thunk_target]\n"
# define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
#endif
@@ -112,5 +167,24 @@ enum spectre_v2_mitigation {
SPECTRE_V2_IBRS,
};
+/*
+ * On VMEXIT we must ensure that no RSB predictions learned in the guest
+ * can be followed in the host, by overwriting the RSB completely. Both
+ * retpoline and IBRS mitigations for Spectre v2 need this; only on future
+ * CPUs with IBRS_ATT *might* it be avoided.
+ */
+static inline void vmexit_fill_RSB(void)
+{
+#ifdef CONFIG_RETPOLINE
+ unsigned long loops = RSB_CLEAR_LOOPS / 2;
+
+ asm volatile (ALTERNATIVE("jmp 910f",
+ __stringify(__FILL_RETURN_BUFFER(%0, RSB_CLEAR_LOOPS, %1)),
+ X86_FEATURE_RETPOLINE)
+ "910:"
+ : "=&r" (loops), ASM_CALL_CONSTRAINT
+ : "r" (loops) : "memory" );
+#endif
+}
#endif /* __ASSEMBLY__ */
#endif /* __NOSPEC_BRANCH_H__ */
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -37,6 +37,7 @@
#include <asm/desc.h>
#include <asm/debugreg.h>
#include <asm/kvm_para.h>
+#include <asm/nospec-branch.h>
#include <asm/virtext.h>
#include "trace.h"
@@ -3904,6 +3905,9 @@ static void svm_vcpu_run(struct kvm_vcpu
#endif
);
+ /* Eliminate branch target predictions from guest mode */
+ vmexit_fill_RSB();
+
#ifdef CONFIG_X86_64
wrmsrl(MSR_GS_BASE, svm->host.gs_base);
#else
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -47,6 +47,7 @@
#include <asm/kexec.h>
#include <asm/apic.h>
#include <asm/irq_remapping.h>
+#include <asm/nospec-branch.h>
#include "trace.h"
#include "pmu.h"
@@ -8701,6 +8702,9 @@ static void __noclone vmx_vcpu_run(struc
#endif
);
+ /* Eliminate branch target predictions from guest mode */
+ vmexit_fill_RSB();
+
/* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed */
if (debugctlmsr)
update_debugctlmsr(debugctlmsr);
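The arithmetic in __FILL_RETURN_BUFFER is easier to follow when flattened: RSB_CLEAR_LOOPS/2 = 16 iterations issue two calls each, so 32 return addresses are pushed (overwriting every RSB entry with a pointer to one of the pause/jmp capture loops), and the trailing add reclaims the 32 * 8 = 256 bytes of stack those calls consumed. A stand-alone sketch of the same flow (an illustration for x86-64 user space, assuming a build without a red zone; not the kernel macro, which parameterises the loop count and stack register):

static inline void rsb_fill_sketch(void)
{
	unsigned long loops;

	asm volatile("mov $16, %0\n\t"		/* RSB_CLEAR_LOOPS / 2 */
		     "771:\n\t"
		     "call 772f\n\t"
		     "773: pause\n\t"		/* speculation trap: only ever  */
		     "jmp 773b\n\t"		/* reached via a predicted ret  */
		     "772:\n\t"
		     "call 774f\n\t"
		     "775: pause\n\t"
		     "jmp 775b\n\t"
		     "774:\n\t"
		     "dec %0\n\t"
		     "jnz 771b\n\t"
		     "add $256, %%rsp"		/* 32 calls x 8 bytes pushed */
		     : "=r" (loops)
		     :
		     : "cc", "memory");
}

Architecturally the capture loops never execute; they only matter if the CPU later consumes one of the planted RSB entries while speculating on a ret.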
Patches currently in stable-queue which might be from dwmw(a)amazon.co.uk are
queue-4.4/x86-spectre-add-boot-time-option-to-select-spectre-v2-mitigation.patch
queue-4.4/x86-retpoline-irq32-convert-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-hyperv-convert-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-entry-convert-entry-assembler-indirect-jumps.patch
queue-4.4/x86-cpu-amd-make-lfence-a-serializing-instruction.patch
queue-4.4/x86-retpoline-ftrace-convert-ftrace-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-crypto-convert-crypto-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-xen-convert-xen-hypercall-indirect-jumps.patch
queue-4.4/x86-retpoline-checksum32-convert-assembler-indirect-jumps.patch
queue-4.4/x86-mm-32-move-setup_clear_cpu_cap-x86_feature_pcid-earlier.patch
queue-4.4/x86-retpoline-fill-return-stack-buffer-on-vmexit.patch
queue-4.4/x86-retpoline-remove-compile-time-warning.patch
queue-4.4/x86-cpu-amd-use-lfence_rdtsc-in-preference-to-mfence_rdtsc.patch
queue-4.4/x86-retpoline-add-initial-retpoline-support.patch
This is a note to let you know that I've just added the patch titled
x86/retpoline/entry: Convert entry assembler indirect jumps
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-retpoline-entry-convert-entry-assembler-indirect-jumps.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 2641f08bb7fc63a636a2b18173221d7040a3512e Mon Sep 17 00:00:00 2001
From: David Woodhouse <dwmw(a)amazon.co.uk>
Date: Thu, 11 Jan 2018 21:46:28 +0000
Subject: x86/retpoline/entry: Convert entry assembler indirect jumps
From: David Woodhouse <dwmw(a)amazon.co.uk>
commit 2641f08bb7fc63a636a2b18173221d7040a3512e upstream.
Convert indirect jumps in core 32/64bit entry assembler code to use
non-speculative sequences when CONFIG_RETPOLINE is enabled.
Don't use CALL_NOSPEC in entry_SYSCALL_64_fastpath because the return
address after the 'call' instruction must be *precisely* at the
.Lentry_SYSCALL_64_after_fastpath label for stub_ptregs_64 to work,
and the use of alternatives will mess that up unless we play horrid
games to prepend with NOPs and make the variants the same length. It's
not worth it; in the case where we ALTERNATIVE out the retpoline, the
first instruction at __x86.indirect_thunk.rax is going to be a bare
jmp *%rax anyway.
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Ingo Molnar <mingo(a)kernel.org>
Acked-by: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: gnomes(a)lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: thomas.lendacky(a)amd.com
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Jiri Kosina <jikos(a)kernel.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Kees Cook <keescook(a)google.com>
Cc: Tim Chen <tim.c.chen(a)linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh(a)linux-foundation.org>
Cc: Paul Turner <pjt(a)google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-7-git-send-email-dwmw@amazon.co.…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Razvan Ghitulete <rga(a)amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/entry/entry_32.S | 5 +++--
arch/x86/entry/entry_64.S | 14 +++++++++++++-
2 files changed, 16 insertions(+), 3 deletions(-)
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -44,6 +44,7 @@
#include <asm/alternative-asm.h>
#include <asm/asm.h>
#include <asm/smap.h>
+#include <asm/nospec-branch.h>
.section .entry.text, "ax"
@@ -226,7 +227,7 @@ ENTRY(ret_from_kernel_thread)
pushl $0x0202 # Reset kernel eflags
popfl
movl PT_EBP(%esp), %eax
- call *PT_EBX(%esp)
+ CALL_NOSPEC PT_EBX(%esp)
movl $0, PT_EAX(%esp)
/*
@@ -938,7 +939,7 @@ error_code:
movl %ecx, %es
TRACE_IRQS_OFF
movl %esp, %eax # pt_regs pointer
- call *%edi
+ CALL_NOSPEC %edi
jmp ret_from_exception
END(page_fault)
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -36,6 +36,7 @@
#include <asm/smap.h>
#include <asm/pgtable_types.h>
#include <asm/kaiser.h>
+#include <asm/nospec-branch.h>
#include <linux/err.h>
/* Avoid __ASSEMBLER__'ifying <linux/audit.h> just for this. */
@@ -184,7 +185,13 @@ entry_SYSCALL_64_fastpath:
#endif
ja 1f /* return -ENOSYS (already in pt_regs->ax) */
movq %r10, %rcx
+#ifdef CONFIG_RETPOLINE
+ movq sys_call_table(, %rax, 8), %rax
+ call __x86_indirect_thunk_rax
+#else
call *sys_call_table(, %rax, 8)
+#endif
+
movq %rax, RAX(%rsp)
1:
/*
@@ -276,7 +283,12 @@ tracesys_phase2:
#endif
ja 1f /* return -ENOSYS (already in pt_regs->ax) */
movq %r10, %rcx /* fixup for C */
+#ifdef CONFIG_RETPOLINE
+ movq sys_call_table(, %rax, 8), %rax
+ call __x86_indirect_thunk_rax
+#else
call *sys_call_table(, %rax, 8)
+#endif
movq %rax, RAX(%rsp)
1:
/* Use IRET because user could have changed pt_regs->foo */
@@ -491,7 +503,7 @@ ENTRY(ret_from_fork)
* nb: we depend on RESTORE_EXTRA_REGS above
*/
movq %rbp, %rdi
- call *%rbx
+ CALL_NOSPEC %rbx
movl $0, RAX(%rsp)
RESTORE_EXTRA_REGS
jmp int_ret_from_sys_call
Patches currently in stable-queue which might be from dwmw(a)amazon.co.uk are
queue-4.4/x86-spectre-add-boot-time-option-to-select-spectre-v2-mitigation.patch
queue-4.4/x86-retpoline-irq32-convert-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-hyperv-convert-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-entry-convert-entry-assembler-indirect-jumps.patch
queue-4.4/x86-cpu-amd-make-lfence-a-serializing-instruction.patch
queue-4.4/x86-retpoline-ftrace-convert-ftrace-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-crypto-convert-crypto-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-xen-convert-xen-hypercall-indirect-jumps.patch
queue-4.4/x86-retpoline-checksum32-convert-assembler-indirect-jumps.patch
queue-4.4/x86-mm-32-move-setup_clear_cpu_cap-x86_feature_pcid-earlier.patch
queue-4.4/x86-retpoline-fill-return-stack-buffer-on-vmexit.patch
queue-4.4/x86-retpoline-remove-compile-time-warning.patch
queue-4.4/x86-cpu-amd-use-lfence_rdtsc-in-preference-to-mfence_rdtsc.patch
queue-4.4/x86-retpoline-add-initial-retpoline-support.patch
This is a note to let you know that I've just added the patch titled
x86/retpoline/checksum32: Convert assembler indirect jumps
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-retpoline-checksum32-convert-assembler-indirect-jumps.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 5096732f6f695001fa2d6f1335a2680b37912c69 Mon Sep 17 00:00:00 2001
From: David Woodhouse <dwmw(a)amazon.co.uk>
Date: Thu, 11 Jan 2018 21:46:32 +0000
Subject: x86/retpoline/checksum32: Convert assembler indirect jumps
From: David Woodhouse <dwmw(a)amazon.co.uk>
commit 5096732f6f695001fa2d6f1335a2680b37912c69 upstream.
Convert all indirect jumps in 32bit checksum assembler code to use
non-speculative sequences when CONFIG_RETPOLINE is enabled.
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Arjan van de Ven <arjan(a)linux.intel.com>
Acked-by: Ingo Molnar <mingo(a)kernel.org>
Cc: gnomes(a)lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: thomas.lendacky(a)amd.com
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Jiri Kosina <jikos(a)kernel.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Kees Cook <keescook(a)google.com>
Cc: Tim Chen <tim.c.chen(a)linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh(a)linux-foundation.org>
Cc: Paul Turner <pjt(a)google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-11-git-send-email-dwmw@amazon.co…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/lib/checksum_32.S | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
--- a/arch/x86/lib/checksum_32.S
+++ b/arch/x86/lib/checksum_32.S
@@ -28,7 +28,8 @@
#include <linux/linkage.h>
#include <asm/errno.h>
#include <asm/asm.h>
-
+#include <asm/nospec-branch.h>
+
/*
* computes a partial checksum, e.g. for TCP/UDP fragments
*/
@@ -155,7 +156,7 @@ ENTRY(csum_partial)
negl %ebx
lea 45f(%ebx,%ebx,2), %ebx
testl %esi, %esi
- jmp *%ebx
+ JMP_NOSPEC %ebx
# Handle 2-byte-aligned regions
20: addw (%esi), %ax
@@ -437,7 +438,7 @@ ENTRY(csum_partial_copy_generic)
andl $-32,%edx
lea 3f(%ebx,%ebx), %ebx
testl %esi, %esi
- jmp *%ebx
+ JMP_NOSPEC %ebx
1: addl $64,%esi
addl $64,%edi
SRC(movb -32(%edx),%bl) ; SRC(movb (%edx),%bl)
Patches currently in stable-queue which might be from dwmw(a)amazon.co.uk are
queue-4.4/x86-spectre-add-boot-time-option-to-select-spectre-v2-mitigation.patch
queue-4.4/x86-retpoline-irq32-convert-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-hyperv-convert-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-entry-convert-entry-assembler-indirect-jumps.patch
queue-4.4/x86-cpu-amd-make-lfence-a-serializing-instruction.patch
queue-4.4/x86-retpoline-ftrace-convert-ftrace-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-crypto-convert-crypto-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-xen-convert-xen-hypercall-indirect-jumps.patch
queue-4.4/x86-retpoline-checksum32-convert-assembler-indirect-jumps.patch
queue-4.4/x86-mm-32-move-setup_clear_cpu_cap-x86_feature_pcid-earlier.patch
queue-4.4/x86-retpoline-fill-return-stack-buffer-on-vmexit.patch
queue-4.4/x86-retpoline-remove-compile-time-warning.patch
queue-4.4/x86-cpu-amd-use-lfence_rdtsc-in-preference-to-mfence_rdtsc.patch
queue-4.4/x86-retpoline-add-initial-retpoline-support.patch
This is a note to let you know that I've just added the patch titled
x86/retpoline: Add initial retpoline support
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-retpoline-add-initial-retpoline-support.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 76b043848fd22dbf7f8bf3a1452f8c70d557b860 Mon Sep 17 00:00:00 2001
From: David Woodhouse <dwmw(a)amazon.co.uk>
Date: Thu, 11 Jan 2018 21:46:25 +0000
Subject: x86/retpoline: Add initial retpoline support
From: David Woodhouse <dwmw(a)amazon.co.uk>
commit 76b043848fd22dbf7f8bf3a1452f8c70d557b860 upstream.
Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide
the corresponding thunks. Provide assembler macros for invoking the thunks
in the same way that GCC does, from native and inline assembler.
This adds X86_FEATURE_RETPOLINE and sets it by default on all CPUs. In
some circumstances, IBRS microcode features may be used instead, and the
retpoline can be disabled.
On AMD CPUs if lfence is serialising, the retpoline can be dramatically
simplified to a simple "lfence; jmp *\reg". A future patch, after it has
been verified that lfence really is serialising in all circumstances, can
enable this by setting the X86_FEATURE_RETPOLINE_AMD feature bit in addition
to X86_FEATURE_RETPOLINE.
Do not align the retpoline in the altinstr section, because there is no
guarantee that it stays aligned when it's copied over the oldinstr during
alternative patching.
[ Andi Kleen: Rename the macros, add CONFIG_RETPOLINE option, export thunks]
[ tglx: Put actual function CALL/JMP in front of the macros, convert to
symbolic labels ]
[ dwmw2: Convert back to numeric labels, merge objtool fixes ]
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Arjan van de Ven <arjan(a)linux.intel.com>
Acked-by: Ingo Molnar <mingo(a)kernel.org>
Cc: gnomes(a)lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: thomas.lendacky(a)amd.com
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Jiri Kosina <jikos(a)kernel.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Kees Cook <keescook(a)google.com>
Cc: Tim Chen <tim.c.chen(a)linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh(a)linux-foundation.org>
Cc: Paul Turner <pjt(a)google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-4-git-send-email-dwmw@amazon.co.…
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
[ 4.4 backport: removed objtool annotation since there is no objtool ]
Signed-off-by: Razvan Ghitulete <rga(a)amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/Kconfig | 13 ++++
arch/x86/Makefile | 10 +++
arch/x86/include/asm/asm-prototypes.h | 25 ++++++++
arch/x86/include/asm/cpufeature.h | 2
arch/x86/include/asm/nospec-branch.h | 106 ++++++++++++++++++++++++++++++++++
arch/x86/kernel/cpu/common.c | 4 +
arch/x86/lib/Makefile | 1
arch/x86/lib/retpoline.S | 48 +++++++++++++++
8 files changed, 209 insertions(+)
create mode 100644 arch/x86/include/asm/nospec-branch.h
create mode 100644 arch/x86/lib/retpoline.S
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -379,6 +379,19 @@ config GOLDFISH
def_bool y
depends on X86_GOLDFISH
+config RETPOLINE
+ bool "Avoid speculative indirect branches in kernel"
+ default y
+ ---help---
+ Compile kernel with the retpoline compiler options to guard against
+ kernel-to-user data leaks by avoiding speculative indirect
+ branches. Requires a compiler with -mindirect-branch=thunk-extern
+ support for full protection. The kernel may run slower.
+
+ Without compiler support, at least indirect branches in assembler
+ code are eliminated. Since this includes the syscall entry path,
+ it is not entirely pointless.
+
if X86_32
config X86_EXTENDED_PLATFORM
bool "Support for extended (non-PC) x86 platforms"
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -189,6 +189,16 @@ KBUILD_CFLAGS += -fno-asynchronous-unwin
KBUILD_CFLAGS += $(mflags-y)
KBUILD_AFLAGS += $(mflags-y)
+# Avoid indirect branches in kernel to deal with Spectre
+ifdef CONFIG_RETPOLINE
+ RETPOLINE_CFLAGS += $(call cc-option,-mindirect-branch=thunk-extern -mindirect-branch-register)
+ ifneq ($(RETPOLINE_CFLAGS),)
+ KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE
+ else
+ $(warning CONFIG_RETPOLINE=y, but not supported by the compiler. Toolchain update recommended.)
+ endif
+endif
+
archscripts: scripts_basic
$(Q)$(MAKE) $(build)=arch/x86/tools relocs
--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -10,7 +10,32 @@
#include <asm/pgtable.h>
#include <asm/special_insns.h>
#include <asm/preempt.h>
+#include <asm/asm.h>
#ifndef CONFIG_X86_CMPXCHG64
extern void cmpxchg8b_emu(void);
#endif
+
+#ifdef CONFIG_RETPOLINE
+#ifdef CONFIG_X86_32
+#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_e ## reg(void);
+#else
+#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_r ## reg(void);
+INDIRECT_THUNK(8)
+INDIRECT_THUNK(9)
+INDIRECT_THUNK(10)
+INDIRECT_THUNK(11)
+INDIRECT_THUNK(12)
+INDIRECT_THUNK(13)
+INDIRECT_THUNK(14)
+INDIRECT_THUNK(15)
+#endif
+INDIRECT_THUNK(ax)
+INDIRECT_THUNK(bx)
+INDIRECT_THUNK(cx)
+INDIRECT_THUNK(dx)
+INDIRECT_THUNK(si)
+INDIRECT_THUNK(di)
+INDIRECT_THUNK(bp)
+INDIRECT_THUNK(sp)
+#endif /* CONFIG_RETPOLINE */
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -200,6 +200,8 @@
#define X86_FEATURE_HWP_PKG_REQ ( 7*32+14) /* Intel HWP_PKG_REQ */
#define X86_FEATURE_INTEL_PT ( 7*32+15) /* Intel Processor Trace */
+#define X86_FEATURE_RETPOLINE ( 7*32+29) /* Generic Retpoline mitigation for Spectre variant 2 */
+#define X86_FEATURE_RETPOLINE_AMD ( 7*32+30) /* AMD Retpoline mitigation for Spectre variant 2 */
/* Because the ALTERNATIVE scheme is for members of the X86_FEATURE club... */
#define X86_FEATURE_KAISER ( 7*32+31) /* CONFIG_PAGE_TABLE_ISOLATION w/o nokaiser */
--- /dev/null
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -0,0 +1,106 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __NOSPEC_BRANCH_H__
+#define __NOSPEC_BRANCH_H__
+
+#include <asm/alternative.h>
+#include <asm/alternative-asm.h>
+#include <asm/cpufeature.h>
+
+#ifdef __ASSEMBLY__
+
+/*
+ * These are the bare retpoline primitives for indirect jmp and call.
+ * Do not use these directly; they only exist to make the ALTERNATIVE
+ * invocation below less ugly.
+ */
+.macro RETPOLINE_JMP reg:req
+ call .Ldo_rop_\@
+.Lspec_trap_\@:
+ pause
+ jmp .Lspec_trap_\@
+.Ldo_rop_\@:
+ mov \reg, (%_ASM_SP)
+ ret
+.endm
+
+/*
+ * This is a wrapper around RETPOLINE_JMP so the called function in reg
+ * returns to the instruction after the macro.
+ */
+.macro RETPOLINE_CALL reg:req
+ jmp .Ldo_call_\@
+.Ldo_retpoline_jmp_\@:
+ RETPOLINE_JMP \reg
+.Ldo_call_\@:
+ call .Ldo_retpoline_jmp_\@
+.endm
+
+/*
+ * JMP_NOSPEC and CALL_NOSPEC macros can be used instead of a simple
+ * indirect jmp/call which may be susceptible to the Spectre variant 2
+ * attack.
+ */
+.macro JMP_NOSPEC reg:req
+#ifdef CONFIG_RETPOLINE
+ ALTERNATIVE_2 __stringify(jmp *\reg), \
+ __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE, \
+ __stringify(lfence; jmp *\reg), X86_FEATURE_RETPOLINE_AMD
+#else
+ jmp *\reg
+#endif
+.endm
+
+.macro CALL_NOSPEC reg:req
+#ifdef CONFIG_RETPOLINE
+ ALTERNATIVE_2 __stringify(call *\reg), \
+ __stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\
+ __stringify(lfence; call *\reg), X86_FEATURE_RETPOLINE_AMD
+#else
+ call *\reg
+#endif
+.endm
+
+#else /* __ASSEMBLY__ */
+
+#if defined(CONFIG_X86_64) && defined(RETPOLINE)
+
+/*
+ * Since the inline asm uses the %V modifier which is only in newer GCC,
+ * the 64-bit one is dependent on RETPOLINE not CONFIG_RETPOLINE.
+ */
+# define CALL_NOSPEC \
+ ALTERNATIVE( \
+ "call *%[thunk_target]\n", \
+ "call __x86_indirect_thunk_%V[thunk_target]\n", \
+ X86_FEATURE_RETPOLINE)
+# define THUNK_TARGET(addr) [thunk_target] "r" (addr)
+
+#elif defined(CONFIG_X86_32) && defined(CONFIG_RETPOLINE)
+/*
+ * For i386 we use the original ret-equivalent retpoline, because
+ * otherwise we'll run out of registers. We don't care about CET
+ * here, anyway.
+ */
+# define CALL_NOSPEC ALTERNATIVE("call *%[thunk_target]\n", \
+ " jmp 904f;\n" \
+ " .align 16\n" \
+ "901: call 903f;\n" \
+ "902: pause;\n" \
+ " jmp 902b;\n" \
+ " .align 16\n" \
+ "903: addl $4, %%esp;\n" \
+ " pushl %[thunk_target];\n" \
+ " ret;\n" \
+ " .align 16\n" \
+ "904: call 901b;\n", \
+ X86_FEATURE_RETPOLINE)
+
+# define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
+#else /* No retpoline */
+# define CALL_NOSPEC "call *%[thunk_target]\n"
+# define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
+#endif
+
+#endif /* __ASSEMBLY__ */
+#endif /* __NOSPEC_BRANCH_H__ */
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -837,6 +837,10 @@ static void __init early_identify_cpu(st
setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
+#ifdef CONFIG_RETPOLINE
+ setup_force_cpu_cap(X86_FEATURE_RETPOLINE);
+#endif
+
fpu__init_system(c);
#ifdef CONFIG_X86_32
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -21,6 +21,7 @@ lib-y += usercopy_$(BITS).o usercopy.o g
lib-y += memcpy_$(BITS).o
lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o
+lib-$(CONFIG_RETPOLINE) += retpoline.o
obj-y += msr.o msr-reg.o msr-reg-export.o
--- /dev/null
+++ b/arch/x86/lib/retpoline.S
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include <linux/stringify.h>
+#include <linux/linkage.h>
+#include <asm/dwarf2.h>
+#include <asm/cpufeature.h>
+#include <asm/alternative-asm.h>
+#include <asm-generic/export.h>
+#include <asm/nospec-branch.h>
+
+.macro THUNK reg
+ .section .text.__x86.indirect_thunk.\reg
+
+ENTRY(__x86_indirect_thunk_\reg)
+ CFI_STARTPROC
+ JMP_NOSPEC %\reg
+ CFI_ENDPROC
+ENDPROC(__x86_indirect_thunk_\reg)
+.endm
+
+/*
+ * Despite being an assembler file we can't just use .irp here
+ * because __KSYM_DEPS__ only uses the C preprocessor and would
+ * only see one instance of "__x86_indirect_thunk_\reg" rather
+ * than one per register with the correct names. So we do it
+ * the simple and nasty way...
+ */
+#define EXPORT_THUNK(reg) EXPORT_SYMBOL(__x86_indirect_thunk_ ## reg)
+#define GENERATE_THUNK(reg) THUNK reg ; EXPORT_THUNK(reg)
+
+GENERATE_THUNK(_ASM_AX)
+GENERATE_THUNK(_ASM_BX)
+GENERATE_THUNK(_ASM_CX)
+GENERATE_THUNK(_ASM_DX)
+GENERATE_THUNK(_ASM_SI)
+GENERATE_THUNK(_ASM_DI)
+GENERATE_THUNK(_ASM_BP)
+GENERATE_THUNK(_ASM_SP)
+#ifdef CONFIG_64BIT
+GENERATE_THUNK(r8)
+GENERATE_THUNK(r9)
+GENERATE_THUNK(r10)
+GENERATE_THUNK(r11)
+GENERATE_THUNK(r12)
+GENERATE_THUNK(r13)
+GENERATE_THUNK(r14)
+GENERATE_THUNK(r15)
+#endif
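The generated thunks are tiny. The stand-alone sketch below (an illustration with an invented symbol name, x86-64 AT&T syntax) shows the body __x86_indirect_thunk_rax ends up with when the RETPOLINE alternative is selected: a caller loads the real target into %rax and does "call my_indirect_thunk_rax", and the thunk performs the transfer with a ret whose mispredicted speculation is parked in the pause loop.

asm(".pushsection .text\n"
    ".globl my_indirect_thunk_rax\n"
    "my_indirect_thunk_rax:\n"
    "	call 1f\n"		/* push the address of the capture loop  */
    "2:	pause\n"		/* a speculative ret lands here ...       */
    "	jmp 2b\n"		/* ... and spins harmlessly               */
    "1:	mov %rax, (%rsp)\n"	/* overwrite return address with target   */
    "	ret\n"			/* architecturally jumps to *%rax         */
    ".popsection\n");

A compiler built with -mindirect-branch=thunk-extern emits calls and jumps to thunks of exactly this shape in place of "call *%rax" / "jmp *%rax", which is why the kernel has to provide and export one thunk per register.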
Patches currently in stable-queue which might be from dwmw(a)amazon.co.uk are
queue-4.4/x86-spectre-add-boot-time-option-to-select-spectre-v2-mitigation.patch
queue-4.4/x86-retpoline-irq32-convert-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-hyperv-convert-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-entry-convert-entry-assembler-indirect-jumps.patch
queue-4.4/x86-cpu-amd-make-lfence-a-serializing-instruction.patch
queue-4.4/x86-retpoline-ftrace-convert-ftrace-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-crypto-convert-crypto-assembler-indirect-jumps.patch
queue-4.4/x86-retpoline-xen-convert-xen-hypercall-indirect-jumps.patch
queue-4.4/x86-retpoline-checksum32-convert-assembler-indirect-jumps.patch
queue-4.4/x86-mm-32-move-setup_clear_cpu_cap-x86_feature_pcid-earlier.patch
queue-4.4/x86-retpoline-fill-return-stack-buffer-on-vmexit.patch
queue-4.4/x86-retpoline-remove-compile-time-warning.patch
queue-4.4/x86-cpu-amd-use-lfence_rdtsc-in-preference-to-mfence_rdtsc.patch
queue-4.4/x86-retpoline-add-initial-retpoline-support.patch
On Wed, Jan 17, 2018 at 10:38:30AM +0000, Harsh Shandilya wrote:
> On Wed 17 Jan, 2018, 3:11 PM Greg KH, <gregkh(a)linuxfoundation.org> wrote:
>
> > I'm announcing the release of the 3.18.92 kernel.
> >
> > All users of the 3.18 kernel series must upgrade.
> >
> > The updated 3.18.y git tree can be found at:
> > git://
> > git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
> > linux-3.18.y
> > and can be browsed at the normal kernel.org git web browser:
> >
> > http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
> >
> > thanks,
> >
> > greg k-h
> >
>
> Builds and boots on the OnePlus 3T, no regressions noticed in general usage.
Great, thanks for testing and letting me know.
greg k-h
This is a note to let you know that I've just added the patch titled
x86/mm/32: Move setup_clear_cpu_cap(X86_FEATURE_PCID) earlier
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-32-move-setup_clear_cpu_cap-x86_feature_pcid-earlier.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b8b7abaed7a49b350f8ba659ddc264b04931d581 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sun, 17 Sep 2017 09:03:50 -0700
Subject: x86/mm/32: Move setup_clear_cpu_cap(X86_FEATURE_PCID) earlier
From: Andy Lutomirski <luto(a)kernel.org>
commit b8b7abaed7a49b350f8ba659ddc264b04931d581 upstream.
Otherwise we might have the PCID feature bit set during cpu_init().
This is just for robustness. I haven't seen any actual bugs here.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Cc: Borislav Petkov <bpetkov(a)suse.de>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Fixes: cba4671af755 ("x86/mm: Disable PCID on 32-bit kernels")
Link: http://lkml.kernel.org/r/b16dae9d6b0db5d9801ddbebbfd83384097c61f3.150566353…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/cpu/bugs.c | 8 --------
arch/x86/kernel/cpu/common.c | 8 ++++++++
2 files changed, 8 insertions(+), 8 deletions(-)
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -22,14 +22,6 @@
void __init check_bugs(void)
{
-#ifdef CONFIG_X86_32
- /*
- * Regardless of whether PCID is enumerated, the SDM says
- * that it can't be enabled in 32-bit mode.
- */
- setup_clear_cpu_cap(X86_FEATURE_PCID);
-#endif
-
identify_boot_cpu();
if (!IS_ENABLED(CONFIG_SMP)) {
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -838,6 +838,14 @@ static void __init early_identify_cpu(st
setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
fpu__init_system(c);
+
+#ifdef CONFIG_X86_32
+ /*
+ * Regardless of whether PCID is enumerated, the SDM says
+ * that it can't be enabled in 32-bit mode.
+ */
+ setup_clear_cpu_cap(X86_FEATURE_PCID);
+#endif
}
void __init early_cpu_init(void)
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.4/x86-asm-use-register-variable-to-get-stack-pointer-value.patch
queue-4.4/x86-mm-32-move-setup_clear_cpu_cap-x86_feature_pcid-earlier.patch
queue-4.4/x86-asm-make-asm-alternative.h-safe-from-assembly.patch
This is a note to let you know that I've just added the patch titled
x86/kbuild: enable modversions for symbols exported from asm
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-kbuild-enable-modversions-for-symbols-exported-from-asm.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 334bb773876403eae3457d81be0b8ea70f8e4ccc Mon Sep 17 00:00:00 2001
From: Adam Borowski <kilobyte(a)angband.pl>
Date: Sun, 11 Dec 2016 02:09:18 +0100
Subject: x86/kbuild: enable modversions for symbols exported from asm
From: Adam Borowski <kilobyte(a)angband.pl>
commit 334bb773876403eae3457d81be0b8ea70f8e4ccc upstream.
Commit 4efca4ed ("kbuild: modversions for EXPORT_SYMBOL() for asm") adds
modversion support for symbols exported from asm files. Architectures
must include C-style declarations for those symbols in asm/asm-prototypes.h
in order for them to be versioned.
Add these declarations for x86, and an architecture-independent file that
can be used for common symbols.
With f27c2f6 reverting 8ab2ae6 ("default exported asm symbols to zero") we
produce a scary warning on x86, this commit fixes that.
Signed-off-by: Adam Borowski <kilobyte(a)angband.pl>
Tested-by: Kalle Valo <kvalo(a)codeaurora.org>
Acked-by: Nicholas Piggin <npiggin(a)gmail.com>
Tested-by: Peter Wu <peter(a)lekensteyn.nl>
Tested-by: Oliver Hartkopp <socketcan(a)hartkopp.net>
Signed-off-by: Michal Marek <mmarek(a)suse.com>
Signed-off-by: Razvan Ghitulete <rga(a)amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/asm-prototypes.h | 16 ++++++++++++++++
include/asm-generic/asm-prototypes.h | 7 +++++++
2 files changed, 23 insertions(+)
--- /dev/null
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -0,0 +1,16 @@
+#include <asm/ftrace.h>
+#include <asm/uaccess.h>
+#include <asm/string.h>
+#include <asm/page.h>
+#include <asm/checksum.h>
+
+#include <asm-generic/asm-prototypes.h>
+
+#include <asm/page.h>
+#include <asm/pgtable.h>
+#include <asm/special_insns.h>
+#include <asm/preempt.h>
+
+#ifndef CONFIG_X86_CMPXCHG64
+extern void cmpxchg8b_emu(void);
+#endif
--- /dev/null
+++ b/include/asm-generic/asm-prototypes.h
@@ -0,0 +1,7 @@
+#include <linux/bitops.h>
+extern void *__memset(void *, int, __kernel_size_t);
+extern void *__memcpy(void *, const void *, __kernel_size_t);
+extern void *__memmove(void *, const void *, __kernel_size_t);
+extern void *memset(void *, int, __kernel_size_t);
+extern void *memcpy(void *, const void *, __kernel_size_t);
+extern void *memmove(void *, const void *, __kernel_size_t);
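The rule is mechanical: any symbol exported from a .S file needs a C-style declaration reachable from asm/asm-prototypes.h so that genksyms can compute a CRC for it. As a hypothetical example (the helper below is invented, not part of this patch), an architecture exporting an assembler routine would pair the EXPORT_SYMBOL in the .S file with a declaration like:

/* arch/<arch>/include/asm/asm-prototypes.h (hypothetical addition) */
extern void my_asm_memclear(void *dst, __kernel_size_t len);

Without such a declaration no symbol version can be generated for the export, which is what produces the warning the commit message refers to.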
Patches currently in stable-queue which might be from kilobyte(a)angband.pl are
queue-4.4/x86-kbuild-enable-modversions-for-symbols-exported-from-asm.patch
This is a note to let you know that I've just added the patch titled
x86/cpu/AMD: Make LFENCE a serializing instruction
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-cpu-amd-make-lfence-a-serializing-instruction.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From e4d0e84e490790798691aaa0f2e598637f1867ec Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky(a)amd.com>
Date: Mon, 8 Jan 2018 16:09:21 -0600
Subject: x86/cpu/AMD: Make LFENCE a serializing instruction
From: Tom Lendacky <thomas.lendacky(a)amd.com>
commit e4d0e84e490790798691aaa0f2e598637f1867ec upstream.
To aid in speculation control, make LFENCE a serializing instruction
since it has less overhead than MFENCE. This is done by setting bit 1
of MSR 0xc0011029 (DE_CFG). Some families that support LFENCE do not
have this MSR. For these families, the LFENCE instruction is already
serializing.
Signed-off-by: Tom Lendacky <thomas.lendacky(a)amd.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Borislav Petkov <bp(a)suse.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Tim Chen <tim.c.chen(a)linux.intel.com>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh(a)linux-foundation.org>
Cc: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: Paul Turner <pjt(a)google.com>
Link: https://lkml.kernel.org/r/20180108220921.12580.71694.stgit@tlendack-t1.amdo…
Signed-off-by: Razvan Ghitulete <rga(a)amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/msr-index.h | 2 ++
arch/x86/kernel/cpu/amd.c | 10 ++++++++++
2 files changed, 12 insertions(+)
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -330,6 +330,8 @@
#define FAM10H_MMIO_CONF_BASE_MASK 0xfffffffULL
#define FAM10H_MMIO_CONF_BASE_SHIFT 20
#define MSR_FAM10H_NODE_ID 0xc001100c
+#define MSR_F10H_DECFG 0xc0011029
+#define MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT 1
/* K8 MSRs */
#define MSR_K8_TOP_MEM1 0xc001001a
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -746,6 +746,16 @@ static void init_amd(struct cpuinfo_x86
set_cpu_cap(c, X86_FEATURE_K8);
if (cpu_has_xmm2) {
+ /*
+ * A serializing LFENCE has less overhead than MFENCE, so
+ * use it for execution serialization. On families which
+ * don't have that MSR, LFENCE is already serializing.
+ * msr_set_bit() uses the safe accessors, too, even if the MSR
+ * is not present.
+ */
+ msr_set_bit(MSR_F10H_DECFG,
+ MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT);
+
/* MFENCE stops RDTSC speculation */
set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
}
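For anyone who wants to check the effect of this change on a running AMD system, the DE_CFG MSR number (0xc0011029) and the serialization bit (bit 1) named in the commit message can be read from user space through the msr character device. A minimal sketch, assuming root privileges and a loaded msr module; the read simply fails on families that do not implement this MSR:
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
int main(void)
{
	uint64_t val;
	/* /dev/cpu/0/msr returns 8 bytes at an offset equal to the MSR number */
	int fd = open("/dev/cpu/0/msr", O_RDONLY);
	if (fd < 0) {
		perror("open /dev/cpu/0/msr (is the msr module loaded?)");
		return 1;
	}
	if (pread(fd, &val, sizeof(val), 0xc0011029) != sizeof(val)) {
		perror("read DE_CFG (MSR may not exist on this family)");
		close(fd);
		return 1;
	}
	printf("DE_CFG = %#llx, LFENCE-serializing bit = %llu\n",
	       (unsigned long long)val, (unsigned long long)((val >> 1) & 1));
	close(fd);
	return 0;
}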
Patches currently in stable-queue which might be from thomas.lendacky(a)amd.com are
queue-4.4/x86-cpu-amd-make-lfence-a-serializing-instruction.patch
queue-4.4/x86-cpu-amd-use-lfence_rdtsc-in-preference-to-mfence_rdtsc.patch
This is a note to let you know that I've just added the patch titled
x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-cpu-amd-use-lfence_rdtsc-in-preference-to-mfence_rdtsc.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 9c6a73c75864ad9fa49e5fa6513e4c4071c0e29f Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky(a)amd.com>
Date: Mon, 8 Jan 2018 16:09:32 -0600
Subject: x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC
From: Tom Lendacky <thomas.lendacky(a)amd.com>
commit 9c6a73c75864ad9fa49e5fa6513e4c4071c0e29f upstream.
With LFENCE now a serializing instruction, use LFENCE_RDTSC in preference
to MFENCE_RDTSC. However, since the kernel could be running under a
hypervisor that does not support writing that MSR, read the MSR back and
verify that the bit has been set successfully. If the MSR can be read
and the bit is set, then set the LFENCE_RDTSC feature, otherwise set the
MFENCE_RDTSC feature.
Signed-off-by: Tom Lendacky <thomas.lendacky(a)amd.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Borislav Petkov <bp(a)suse.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Tim Chen <tim.c.chen(a)linux.intel.com>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh(a)linux-foundation.org>
Cc: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: Paul Turner <pjt(a)google.com>
Link: https://lkml.kernel.org/r/20180108220932.12580.52458.stgit@tlendack-t1.amdo…
Signed-off-by: Razvan Ghitulete <rga(a)amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kernel/cpu/amd.c | 18 ++++++++++++++++--
2 files changed, 17 insertions(+), 2 deletions(-)
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -332,6 +332,7 @@
#define MSR_FAM10H_NODE_ID 0xc001100c
#define MSR_F10H_DECFG 0xc0011029
#define MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT 1
+#define MSR_F10H_DECFG_LFENCE_SERIALIZE BIT_ULL(MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT)
/* K8 MSRs */
#define MSR_K8_TOP_MEM1 0xc001001a
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -746,6 +746,9 @@ static void init_amd(struct cpuinfo_x86
set_cpu_cap(c, X86_FEATURE_K8);
if (cpu_has_xmm2) {
+ unsigned long long val;
+ int ret;
+
/*
* A serializing LFENCE has less overhead than MFENCE, so
* use it for execution serialization. On families which
@@ -756,8 +759,19 @@ static void init_amd(struct cpuinfo_x86
msr_set_bit(MSR_F10H_DECFG,
MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT);
- /* MFENCE stops RDTSC speculation */
- set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
+ /*
+ * Verify that the MSR write was successful (could be running
+ * under a hypervisor) and only then assume that LFENCE is
+ * serializing.
+ */
+ ret = rdmsrl_safe(MSR_F10H_DECFG, &val);
+ if (!ret && (val & MSR_F10H_DECFG_LFENCE_SERIALIZE)) {
+ /* A serializing LFENCE stops RDTSC speculation */
+ set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
+ } else {
+ /* MFENCE stops RDTSC speculation */
+ set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
+ }
}
/*
Patches currently in stable-queue which might be from thomas.lendacky(a)amd.com are
queue-4.4/x86-cpu-amd-make-lfence-a-serializing-instruction.patch
queue-4.4/x86-cpu-amd-use-lfence_rdtsc-in-preference-to-mfence_rdtsc.patch
This is a note to let you know that I've just added the patch titled
x86/asm: Use register variable to get stack pointer value
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-asm-use-register-variable-to-get-stack-pointer-value.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 196bd485ee4f03ce4c690bfcf38138abfcd0a4bc Mon Sep 17 00:00:00 2001
From: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Date: Fri, 29 Sep 2017 17:15:36 +0300
Subject: x86/asm: Use register variable to get stack pointer value
From: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
commit 196bd485ee4f03ce4c690bfcf38138abfcd0a4bc upstream.
Currently we use the current_stack_pointer() function to get the value
of the stack pointer register. Since commit:
f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
... we have a stack register variable declared. It can be used instead of
the current_stack_pointer() function, which allows the compiler to optimize
away some excessive "mov %rsp, %<dst>" instructions:
-mov %rsp,%rdx
-sub %rdx,%rax
-cmp $0x3fff,%rax
-ja ffffffff810722fd <ist_begin_non_atomic+0x2d>
+sub %rsp,%rax
+cmp $0x3fff,%rax
+ja ffffffff810722fa <ist_begin_non_atomic+0x2a>
Remove current_stack_pointer(), rename __asm_call_sp to current_stack_pointer
and use it instead of the removed function.
Signed-off-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Reviewed-by: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/20170929141537.29167-1-aryabinin@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
[dwmw2: We want ASM_CALL_CONSTRAINT for retpoline]
Signed-off-by: David Woodhouse <dwmw(a)amazon.co.uk>
Signed-off-by: Razvan Ghitulete <rga(a)amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/asm.h | 11 +++++++++++
arch/x86/include/asm/thread_info.h | 11 -----------
arch/x86/kernel/irq_32.c | 6 +++---
arch/x86/kernel/traps.c | 2 +-
4 files changed, 15 insertions(+), 15 deletions(-)
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -105,4 +105,15 @@
/* For C file, we already have NOKPROBE_SYMBOL macro */
#endif
+#ifndef __ASSEMBLY__
+/*
+ * This output constraint should be used for any inline asm which has a "call"
+ * instruction. Otherwise the asm may be inserted before the frame pointer
+ * gets set up by the containing function. If you forget to do this, objtool
+ * may print a "call without frame pointer save/setup" warning.
+ */
+register unsigned long current_stack_pointer asm(_ASM_SP);
+#define ASM_CALL_CONSTRAINT "+r" (current_stack_pointer)
+#endif
+
#endif /* _ASM_X86_ASM_H */
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -166,17 +166,6 @@ static inline struct thread_info *curren
return (struct thread_info *)(current_top_of_stack() - THREAD_SIZE);
}
-static inline unsigned long current_stack_pointer(void)
-{
- unsigned long sp;
-#ifdef CONFIG_X86_64
- asm("mov %%rsp,%0" : "=g" (sp));
-#else
- asm("mov %%esp,%0" : "=g" (sp));
-#endif
- return sp;
-}
-
#else /* !__ASSEMBLY__ */
#ifdef CONFIG_X86_64
--- a/arch/x86/kernel/irq_32.c
+++ b/arch/x86/kernel/irq_32.c
@@ -65,7 +65,7 @@ static void call_on_stack(void *func, vo
static inline void *current_stack(void)
{
- return (void *)(current_stack_pointer() & ~(THREAD_SIZE - 1));
+ return (void *)(current_stack_pointer & ~(THREAD_SIZE - 1));
}
static inline int execute_on_irq_stack(int overflow, struct irq_desc *desc)
@@ -89,7 +89,7 @@ static inline int execute_on_irq_stack(i
/* Save the next esp at the bottom of the stack */
prev_esp = (u32 *)irqstk;
- *prev_esp = current_stack_pointer();
+ *prev_esp = current_stack_pointer;
if (unlikely(overflow))
call_on_stack(print_stack_overflow, isp);
@@ -142,7 +142,7 @@ void do_softirq_own_stack(void)
/* Push the previous esp onto the stack */
prev_esp = (u32 *)irqstk;
- *prev_esp = current_stack_pointer();
+ *prev_esp = current_stack_pointer;
call_on_stack(__do_softirq, isp);
}
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -166,7 +166,7 @@ void ist_begin_non_atomic(struct pt_regs
* from double_fault.
*/
BUG_ON((unsigned long)(current_top_of_stack() -
- current_stack_pointer()) >= THREAD_SIZE);
+ current_stack_pointer) >= THREAD_SIZE);
preempt_enable_no_resched();
}
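For readers unfamiliar with global register variables, a minimal user-space sketch of the mechanism being backported follows; it assumes x86-64 and a GCC-compatible compiler and is not part of the patch. Binding a C variable to the stack pointer register lets the compiler use %rsp directly as an operand, which is why the extra "mov %rsp,%rdx" disappears in the disassembly quoted above.
#include <stdio.h>
/* The kernel declares this with asm(_ASM_SP); the register name is
 * spelled out here so the file stands alone. */
register unsigned long current_stack_pointer asm("rsp");
int main(void)
{
	/* Reading the variable is simply a read of %rsp at this point. */
	printf("current stack pointer: %#lx\n", current_stack_pointer);
	return 0;
}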
Patches currently in stable-queue which might be from aryabinin(a)virtuozzo.com are
queue-4.4/x86-asm-use-register-variable-to-get-stack-pointer-value.patch
This is a note to let you know that I've just added the patch titled
kconfig.h: use __is_defined() to check if MODULE is defined
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kconfig.h-use-__is_defined-to-check-if-module-is-defined.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 4f920843d248946545415c1bf6120942048708ed Mon Sep 17 00:00:00 2001
From: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Date: Tue, 14 Jun 2016 14:58:54 +0900
Subject: kconfig.h: use __is_defined() to check if MODULE is defined
From: Masahiro Yamada <yamada.masahiro(a)socionext.com>
commit 4f920843d248946545415c1bf6120942048708ed upstream.
The macro MODULE is not a config option; it is a per-file build
option. So config_enabled(MODULE) is not sensible. (There is
another case in include/linux/export.h, where config_enabled() is
used against a non-config option.)
This commit renames some macros in include/linux/kconfig.h for the
use for non-config macros and replaces config_enabled(MODULE) with
__is_defined(MODULE).
I am keeping config_enabled() because it is still referenced from
some places, but I expect it would be deprecated in the future.
Signed-off-by: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Signed-off-by: Michal Marek <mmarek(a)suse.com>
Signed-off-by: Razvan Ghitulete <rga(a)amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/kconfig.h | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
--- a/include/linux/kconfig.h
+++ b/include/linux/kconfig.h
@@ -17,10 +17,11 @@
* the last step cherry picks the 2nd arg, we get a zero.
*/
#define __ARG_PLACEHOLDER_1 0,
-#define config_enabled(cfg) _config_enabled(cfg)
-#define _config_enabled(value) __config_enabled(__ARG_PLACEHOLDER_##value)
-#define __config_enabled(arg1_or_junk) ___config_enabled(arg1_or_junk 1, 0)
-#define ___config_enabled(__ignored, val, ...) val
+#define config_enabled(cfg) ___is_defined(cfg)
+#define __is_defined(x) ___is_defined(x)
+#define ___is_defined(val) ____is_defined(__ARG_PLACEHOLDER_##val)
+#define ____is_defined(arg1_or_junk) __take_second_arg(arg1_or_junk 1, 0)
+#define __take_second_arg(__ignored, val, ...) val
/*
* IS_BUILTIN(CONFIG_FOO) evaluates to 1 if CONFIG_FOO is set to 'y', 0
@@ -42,7 +43,7 @@
* built-in code when CONFIG_FOO is set to 'm'.
*/
#define IS_REACHABLE(option) (config_enabled(option) || \
- (config_enabled(option##_MODULE) && config_enabled(MODULE)))
+ (config_enabled(option##_MODULE) && __is_defined(MODULE)))
/*
* IS_ENABLED(CONFIG_FOO) evaluates to 1 if CONFIG_FOO is set to 'y' or 'm',
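To see why __is_defined(MODULE) evaluates to 1 exactly when kbuild passes -DMODULE, the following stand-alone sketch copies the renamed macros from the hunk above; the file name and build commands are illustrative only.
#include <stdio.h>
#define __ARG_PLACEHOLDER_1 0,
#define __is_defined(x)		___is_defined(x)
#define ___is_defined(val)	____is_defined(__ARG_PLACEHOLDER_##val)
#define ____is_defined(arg1_or_junk)	__take_second_arg(arg1_or_junk 1, 0)
#define __take_second_arg(__ignored, val, ...) val
int main(void)
{
	/* With -DMODULE the token paste yields __ARG_PLACEHOLDER_1, which
	 * expands to "0," and makes 1 the second argument; without it the
	 * paste produces an unknown token and the trailing 0 is taken. */
	printf("__is_defined(MODULE) = %d\n", __is_defined(MODULE));
	return 0;
}
Building it both ways shows the behaviour: "cc -DMODULE is_defined.c" produces a program that prints 1, while a plain "cc is_defined.c" build prints 0.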
Patches currently in stable-queue which might be from yamada.masahiro(a)socionext.com are
queue-4.4/kconfig.h-use-__is_defined-to-check-if-module-is-defined.patch
This is a note to let you know that I've just added the patch titled
x86/asm: Make asm/alternative.h safe from assembly
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-asm-make-asm-alternative.h-safe-from-assembly.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From f005f5d860e0231fe212cfda8c1a3148b99609f4 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Tue, 26 Apr 2016 12:23:25 -0700
Subject: x86/asm: Make asm/alternative.h safe from assembly
From: Andy Lutomirski <luto(a)kernel.org>
commit f005f5d860e0231fe212cfda8c1a3148b99609f4 upstream.
asm/alternative.h isn't directly useful from assembly, but it
shouldn't break the build.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/e5b693fcef99fe6e80341c9e97a002fb23871e91.146169831…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Razvan Ghitulete <rga(a)amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/alternative.h | 4 ++++
1 file changed, 4 insertions(+)
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -1,6 +1,8 @@
#ifndef _ASM_X86_ALTERNATIVE_H
#define _ASM_X86_ALTERNATIVE_H
+#ifndef __ASSEMBLY__
+
#include <linux/types.h>
#include <linux/stddef.h>
#include <linux/stringify.h>
@@ -271,4 +273,6 @@ extern void *text_poke(void *addr, const
extern int poke_int3_handler(struct pt_regs *regs);
extern void *text_poke_bp(void *addr, const void *opcode, size_t len, void *handler);
+#endif /* __ASSEMBLY__ */
+
#endif /* _ASM_X86_ALTERNATIVE_H */
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.4/x86-asm-use-register-variable-to-get-stack-pointer-value.patch
queue-4.4/x86-mm-32-move-setup_clear_cpu_cap-x86_feature_pcid-earlier.patch
queue-4.4/x86-asm-make-asm-alternative.h-safe-from-assembly.patch
This is a note to let you know that I've just added the patch titled
EXPORT_SYMBOL() for asm
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
export_symbol-for-asm.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 22823ab419d8ed884195cfa75483fd3a99bb1462 Mon Sep 17 00:00:00 2001
From: Al Viro <viro(a)zeniv.linux.org.uk>
Date: Mon, 11 Jan 2016 10:54:54 -0500
Subject: EXPORT_SYMBOL() for asm
From: Al Viro <viro(a)zeniv.linux.org.uk>
commit 22823ab419d8ed884195cfa75483fd3a99bb1462 upstream.
Add asm-usable variants of EXPORT_SYMBOL/EXPORT_SYMBOL_GPL. This
commit just adds the default implementation; most of the architectures
can simply add export.h to asm/Kbuild and start using <asm/export.h>
from assembler. The rest need to have their <asm/export.h> define
several macros and then explicitly include <asm-generic/export.h>.
One area where the things might diverge from default is the alignment;
normally it's 8 bytes on 64bit targets and 4 on 32bit ones, both for
unsigned long and for struct kernel_symbol. Unfortunately, amd64 and
m68k are unusual - m68k aligns to 2 bytes (for both) and amd64 aligns
struct kernel_symbol to 16 bytes. For those we'll need asm/export.h to
override the constants used by generic version - KSYM_ALIGN and KCRC_ALIGN
for kernel_symbol and unsigned long resp. And no, __alignof__ would not
do the trick - on amd64 __alignof__ of struct kernel_symbol is 8, not 16.
More serious source of unpleasantness is treatment of function
descriptors on architectures that have those. Things like ppc64,
parisc, ia64, etc. need more than the address of the first insn to
call an arbitrary function. As the result, their representation of
pointers to functions is not the typical "address of the entry point" -
it's an address of a small static structure containing all the required
information (including the entry point, of course). Sadly, the asm-side
conventions differ in what the function name refers to - entry point or
the function descriptor. On ppc64 we do the latter;
bar: .quad foo
is what void (*bar)(void) = foo; turns into and the rare places where
we need to explicitly work with the label of entry point are dealt with
as DOTSYM(foo). For our purposes it's ideal - generic macros are usable.
However, parisc would have foo and P%foo used for the label of the entry point
and the address of the function descriptor, and
bar: .long P%foo
would be used instead. ia64 is similar to parisc in that respect,
except that there it's @fptr(foo) rather than P%foo. Such architectures
need to define KSYM_FUNC that would turn a function name into whatever
is needed to refer to function descriptor.
What's more, on such architectures we need to know whether we are exporting
a function or an object - in assembler we have to tell that explicitly, to
decide whether we want EXPORT_SYMBOL(foo) to produce e.g.
__ksymtab_foo: .quad foo
or
__ksymtab_foo: .quad @fptr(foo)
For that reason we introduce EXPORT_DATA_SYMBOL{,_GPL}(), to be used for
exports of data objects. On normal architectures it's the same thing
as EXPORT_SYMBOL{,_GPL}(), but on parisc-like ones they differ and the
right one needs to be used. Most of the exports are functions, so we
keep EXPORT_SYMBOL for those...
Signed-off-by: Al Viro <viro(a)zeniv.linux.org.uk>
Signed-off-by: Razvan Ghitulete <rga(a)amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/asm-generic/export.h | 94 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 94 insertions(+)
--- /dev/null
+++ b/include/asm-generic/export.h
@@ -0,0 +1,94 @@
+#ifndef __ASM_GENERIC_EXPORT_H
+#define __ASM_GENERIC_EXPORT_H
+
+#ifndef KSYM_FUNC
+#define KSYM_FUNC(x) x
+#endif
+#ifdef CONFIG_64BIT
+#define __put .quad
+#ifndef KSYM_ALIGN
+#define KSYM_ALIGN 8
+#endif
+#ifndef KCRC_ALIGN
+#define KCRC_ALIGN 8
+#endif
+#else
+#define __put .long
+#ifndef KSYM_ALIGN
+#define KSYM_ALIGN 4
+#endif
+#ifndef KCRC_ALIGN
+#define KCRC_ALIGN 4
+#endif
+#endif
+
+#ifdef CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
+#define KSYM(name) _##name
+#else
+#define KSYM(name) name
+#endif
+
+/*
+ * note on .section use: @progbits vs %progbits nastiness doesn't matter,
+ * since we immediately emit into those sections anyway.
+ */
+.macro ___EXPORT_SYMBOL name,val,sec
+#ifdef CONFIG_MODULES
+ .globl KSYM(__ksymtab_\name)
+ .section ___ksymtab\sec+\name,"a"
+ .balign KSYM_ALIGN
+KSYM(__ksymtab_\name):
+ __put \val, KSYM(__kstrtab_\name)
+ .previous
+ .section __ksymtab_strings,"a"
+KSYM(__kstrtab_\name):
+#ifdef CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
+ .asciz "_\name"
+#else
+ .asciz "\name"
+#endif
+ .previous
+#ifdef CONFIG_MODVERSIONS
+ .section ___kcrctab\sec+\name,"a"
+ .balign KCRC_ALIGN
+KSYM(__kcrctab_\name):
+ __put KSYM(__crc_\name)
+ .weak KSYM(__crc_\name)
+ .previous
+#endif
+#endif
+.endm
+#undef __put
+
+#if defined(__KSYM_DEPS__)
+
+#define __EXPORT_SYMBOL(sym, val, sec) === __KSYM_##sym ===
+
+#elif defined(CONFIG_TRIM_UNUSED_KSYMS)
+
+#include <linux/kconfig.h>
+#include <generated/autoksyms.h>
+
+#define __EXPORT_SYMBOL(sym, val, sec) \
+ __cond_export_sym(sym, val, sec, config_enabled(__KSYM_##sym))
+#define __cond_export_sym(sym, val, sec, conf) \
+ ___cond_export_sym(sym, val, sec, conf)
+#define ___cond_export_sym(sym, val, sec, enabled) \
+ __cond_export_sym_##enabled(sym, val, sec)
+#define __cond_export_sym_1(sym, val, sec) ___EXPORT_SYMBOL sym, val, sec
+#define __cond_export_sym_0(sym, val, sec) /* nothing */
+
+#else
+#define __EXPORT_SYMBOL(sym, val, sec) ___EXPORT_SYMBOL sym, val, sec
+#endif
+
+#define EXPORT_SYMBOL(name) \
+ __EXPORT_SYMBOL(name, KSYM_FUNC(KSYM(name)),)
+#define EXPORT_SYMBOL_GPL(name) \
+ __EXPORT_SYMBOL(name, KSYM_FUNC(KSYM(name)), _gpl)
+#define EXPORT_DATA_SYMBOL(name) \
+ __EXPORT_SYMBOL(name, KSYM(name),)
+#define EXPORT_DATA_SYMBOL_GPL(name) \
+ __EXPORT_SYMBOL(name, KSYM(name),_gpl)
+
+#endif
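As a usage sketch (not taken from any in-tree file): once an architecture provides an <asm/export.h> that pulls in this generic header, an assembler routine can be exported straight from its .S file. The symbol name my_asm_helper is purely hypothetical.
/* hypothetical arch/x86/lib/my_asm_helper.S */
#include <linux/linkage.h>
#include <asm/export.h>
ENTRY(my_asm_helper)
	/* ... function body elided ... */
	ret
ENDPROC(my_asm_helper)
	EXPORT_SYMBOL(my_asm_helper)
A data object would use EXPORT_DATA_SYMBOL() instead, which is the distinction that matters on the function-descriptor architectures discussed above.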
Patches currently in stable-queue which might be from viro(a)zeniv.linux.org.uk are
queue-4.4/export_symbol-for-asm.patch
This is a note to let you know that I've just added the patch titled
gcov: disable for COMPILE_TEST
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
gcov-disable-for-compile_test.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From cc622420798c4bcf093785d872525087a7798db9 Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Mon, 25 Apr 2016 17:35:29 +0200
Subject: gcov: disable for COMPILE_TEST
From: Arnd Bergmann <arnd(a)arndb.de>
commit cc622420798c4bcf093785d872525087a7798db9 upstream.
Enabling gcov is counterproductive to compile testing: it significantly
increases the kernel image size, compile time, and it produces lots
of false positive "may be used uninitialized" warnings as the result
of missed optimizations.
This is in line with how UBSAN_SANITIZE_ALL and PROFILE_ALL_BRANCHES
work, both of which have similar problems.
With an ARM allmodconfig kernel, I see the build time drop from
283 minutes CPU time to 225 minutes, and the vmlinux size drops
from 43MB to 26MB.
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Acked-by: Peter Oberparleiter <oberpar(a)linux.vnet.ibm.com>
Signed-off-by: Michal Marek <mmarek(a)suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/gcov/Kconfig | 1 +
1 file changed, 1 insertion(+)
--- a/kernel/gcov/Kconfig
+++ b/kernel/gcov/Kconfig
@@ -37,6 +37,7 @@ config ARCH_HAS_GCOV_PROFILE_ALL
config GCOV_PROFILE_ALL
bool "Profile entire Kernel"
+ depends on !COMPILE_TEST
depends on GCOV_KERNEL
depends on ARCH_HAS_GCOV_PROFILE_ALL
default n
Patches currently in stable-queue which might be from arnd(a)arndb.de are
queue-4.4/gcov-disable-for-compile_test.patch
This is a note to let you know that I've just added the patch titled
gcov: disable for COMPILE_TEST
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
gcov-disable-for-compile_test.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From cc622420798c4bcf093785d872525087a7798db9 Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Mon, 25 Apr 2016 17:35:29 +0200
Subject: gcov: disable for COMPILE_TEST
From: Arnd Bergmann <arnd(a)arndb.de>
commit cc622420798c4bcf093785d872525087a7798db9 upstream.
Enabling gcov is counterproductive to compile testing: it significantly
increases the kernel image size, compile time, and it produces lots
of false positive "may be used uninitialized" warnings as the result
of missed optimizations.
This is in line with how UBSAN_SANITIZE_ALL and PROFILE_ALL_BRANCHES
work, both of which have similar problems.
With an ARM allmodconfig kernel, I see the build time drop from
283 minutes CPU time to 225 minutes, and the vmlinux size drops
from 43MB to 26MB.
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Acked-by: Peter Oberparleiter <oberpar(a)linux.vnet.ibm.com>
Signed-off-by: Michal Marek <mmarek(a)suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/gcov/Kconfig | 1 +
1 file changed, 1 insertion(+)
--- a/kernel/gcov/Kconfig
+++ b/kernel/gcov/Kconfig
@@ -34,6 +34,7 @@ config GCOV_KERNEL
config GCOV_PROFILE_ALL
bool "Profile entire Kernel"
+ depends on !COMPILE_TEST
depends on GCOV_KERNEL
depends on SUPERH || S390 || X86 || PPC || MICROBLAZE || ARM || ARM64
default n
Patches currently in stable-queue which might be from arnd(a)arndb.de are
queue-3.18/gcov-disable-for-compile_test.patch
On Mon, Jan 15, 2018 at 2:30 PM, Olof's autobuilder <build(a)lixom.net> wrote:
> Here are the build results from automated periodic testing.
>
> The tree being built was stable-rc, found at:
>
> URL: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
>
> Branch: linux-3.18.y
>
> Topmost commits:
> 4dc79b5 Linux 3.18.92-rc1
> 00d0655 e1000e: Fix e1000_check_for_copper_link_ich8lan return value.
> b944e64 uas: ignore UAS for Norelsys NS1068(X) chips
>
> Runtime: 28m 37s
>
> Passed: 131
>
> Warnings: 20680
>
> No errors
>
>
> Warnings:
>
> arm64.allmodconfig:
> /tmp/ccmFH2sM.s:77: Warning: ignoring incorrect section type for .init_array.00100
> /tmp/ccmFH2sM.s:90: Warning: ignoring incorrect section type for .fini_array.00100
> /tmp/ccCNAAf2.s:50: Warning: ignoring incorrect section type for .init_array.00100
> /tmp/ccCNAAf2.s:63: Warning: ignoring incorrect section type for .fini_array.00100
> /tmp/cckbGzsX.s:337: Warning: ignoring incorrect section type for .init_array.00100
> /tmp/cckbGzsX.s:350: Warning: ignoring incorrect section type for .fini_array.00100
This is the result of a bug in the assembler that has since been fixed. The
warning itself is apparently harmless, but it's annoying to get 20000 lines of
warnings for a simple allmodconfig build. I would suggest backporting commit
cc622420798c ("gcov: disable for COMPILE_TEST")
to all stable kernels 3.18, 4.1 and 4.4 that are affected by this.
Olof's build bot only reported it for 3.18, but my interpretation is that he
uses an older toolchain for that kernel, which triggers this warning, while
newer assemblers are fixed. The warning showed up in the past few days after
Olof's build scripts got adapted to also report assembler warnings, rather
than just compiler warnings.
The intention of the cc622420798c commit was to help with other issues of
compile testing; fixing this particular warning was an unintended side effect.
Adding it to stable kernels will also help with the other issues it addressed
at the time, in particular CPU usage during 'allmodconfig' build testing.
Arnd
From: Sara Sharon <sara.sharon(a)intel.com>
commit 0232d2cd7aa8e1b810fe84fb4059a0bd1eabe2ba upstream.
When we get an HW rfkill, stop_device ends up being called from
two paths.
One path is the IRQ handler calling stop_device and updating the op
mode and the stack.
As a result, cfg80211 runs its rfkill sync work, which shuts
down all devices (the second path).
In the second path, we eventually get to iwl_mvm_stop_device,
which calls iwl_fw_dump_conf_clear->iwl_fw_dbg_stop_recording,
which accesses periphery registers.
The device may already have been stopped at this point by the first path,
which will result in a failure to access those registers.
Simply checking for the trans status is insufficient, since
the race will still exist, only minimized.
Instead, move the stop from iwl_fw_dump_conf_clear (which is
called only from the stop path) to the transport stop device
function, where the access is always safe.
This has the added value of actually stopping dbgc before
stopping the device even when the stop is initiated from the
transport.
Fixes: 1efc3843a4ee ("iwlwifi: stop dbgc recording before stopping DMA")
Signed-off-by: Sara Sharon <sara.sharon(a)intel.com>
Signed-off-by: Luca Coelho <luciano.coelho(a)intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach(a)intel.com>
---
drivers/net/wireless/intel/iwlwifi/fw/dbg.h | 2 --
drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c | 6 ++++++
drivers/net/wireless/intel/iwlwifi/pcie/trans.c | 9 +++++++++
3 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/fw/dbg.h b/drivers/net/wireless/intel/iwlwifi/fw/dbg.h
index 9c889a32fe24..223fb77a3aa9 100644
--- a/drivers/net/wireless/intel/iwlwifi/fw/dbg.h
+++ b/drivers/net/wireless/intel/iwlwifi/fw/dbg.h
@@ -209,8 +209,6 @@ static inline void iwl_fw_dbg_stop_recording(struct iwl_fw_runtime *fwrt)
static inline void iwl_fw_dump_conf_clear(struct iwl_fw_runtime *fwrt)
{
- iwl_fw_dbg_stop_recording(fwrt);
-
fwrt->dump.conf = FW_DBG_INVALID;
}
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c b/drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c
index c59f4581e972..ac05fd1e74c4 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c
@@ -49,6 +49,7 @@
*
*****************************************************************************/
#include "iwl-trans.h"
+#include "iwl-prph.h"
#include "iwl-context-info.h"
#include "internal.h"
@@ -156,6 +157,11 @@ void _iwl_trans_pcie_gen2_stop_device(struct iwl_trans *trans, bool low_power)
trans_pcie->is_down = true;
+ /* Stop dbgc before stopping device */
+ iwl_write_prph(trans, DBGC_IN_SAMPLE, 0);
+ udelay(100);
+ iwl_write_prph(trans, DBGC_OUT_CTRL, 0);
+
/* tell the device to stop sending interrupts */
iwl_disable_interrupts(trans);
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
index 2e3e013ec95a..12a9b86d71ea 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
@@ -1138,6 +1138,15 @@ static void _iwl_trans_pcie_stop_device(struct iwl_trans *trans, bool low_power)
trans_pcie->is_down = true;
+ /* Stop dbgc before stopping device */
+ if (trans->cfg->device_family == IWL_DEVICE_FAMILY_7000) {
+ iwl_set_bits_prph(trans, MON_BUFF_SAMPLE_CTL, 0x100);
+ } else {
+ iwl_write_prph(trans, DBGC_IN_SAMPLE, 0);
+ udelay(100);
+ iwl_write_prph(trans, DBGC_OUT_CTRL, 0);
+ }
+
/* tell the device to stop sending interrupts */
iwl_disable_interrupts(trans);
--
2.14.3
Commit: e39d200fa5bf5b94a0948db0dae44c1b73b84a56
Target Stable Tree Branch: 4.9.y
Why this patch is needed: Due to a request to handle CVE-2017-17741, we would need to backport this patch to our kernel. The patch is already in mainline: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git…
Could you please include this patch in the 4.9.y stable tree branch?
--AH
[Fair warning: This is pure conjecture right now.]
In
commit b8e2b0199cc377617dc238f5106352c06dcd3fa2
Author: Peter Rosin <peda(a)axentia.se>
Date: Tue Jul 4 12:36:57 2017 +0200
drm/fb-helper: factor out pseudo-palette
Peter extracted the pseudo-palette computation, but seems to have introduced
an off-by-one. I spotted that +1, but then noticed that we passed
start++ to the (now gone) setcolreg function, so it seemed to all match
up. Except post vs. pre-increment ftw.
The result is that the palette is off by one, and the foreground color
(slot 0) ends up black, rendering a fairly unreadable console.
What baffles me is that on some systems it still seems to work
somehow, which led us all down a wild goose chase trying to add
load_lut calls back in various places (those calls were also intentionally
removed, but really don't seem to be the root cause).
Fixes: b8e2b0199cc3 ("drm/fb-helper: factor out pseudo-palette")
Cc: Peter Rosin <peda(a)axentia.se>
Cc: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: Daniel Vetter <daniel.vetter(a)intel.com>
Cc: Gustavo Padovan <gustavo(a)padovan.org>
Cc: Sean Paul <seanpaul(a)chromium.org>
Cc: David Airlie <airlied(a)linux.ie>
Cc: dri-devel(a)lists.freedesktop.org
Cc: <stable(a)vger.kernel.org> # v4.14+
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=198123
Reported-by: Deposite Pirate <dpirate(a)metalpunks.info>
Reported-by: Bill Fraser <bill.fraser(a)gmail.com>
Cc: Deposite Pirate <dpirate(a)metalpunks.info>
Cc: Bill Fraser <bill.fraser(a)gmail.com>
Cc: Michel Dänzer <michel(a)daenzer.net>
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
---
drivers/gpu/drm/drm_fb_helper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 035784ddd133..1c3a200c4a10 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -1295,7 +1295,7 @@ static int setcmap_pseudo_palette(struct fb_cmap *cmap, struct fb_info *info)
mask <<= info->var.transp.offset;
value |= mask;
}
- palette[cmap->start + i] = value;
+ palette[cmap->start] = value;
}
return 0;
--
2.15.1