- Linux-stable-mirror - lists.linaro.org

[Linux-stable-mirror] v4.14.10 build: 0 failures 0 warnings (v4.14.10)

by Build bot for Mark Brown

Tree/Branch: v4.14.10 Git describe: v4.14.10 Commit: b8ce8232fc Linux 4.14.10 Build Time: 119 min 28 sec Passed: 10 / 10 (100.00 %) Failed: 0 / 10 ( 0.00 %) Errors: 0 Warnings: 0 Section Mismatches: 0 ------------------------------------------------------------------------------- defconfigs with issues (other than build errors): ------------------------------------------------------------------------------- =============================================================================== Detailed per-defconfig build reports below: ------------------------------------------------------------------------------- Passed with no errors, warnings or mismatches: arm64-allnoconfig arm64-allmodconfig arm-multi_v5_defconfig arm-multi_v7_defconfig x86_64-defconfig arm-allmodconfig arm-allnoconfig x86_64-allnoconfig arm-multi_v4t_defconfig arm64-defconfig close failed in file object destructor: sys.excepthook is missing lost sys.stderr

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] 4.14.9 with CONFIG_MCORE2 fails to boot

by Alexander Tsoy

Hello, 4.14.9 fails to boot if CONFIG_MCORE2 is enabled and when compiled with gcc 6+. More details in the following bug reports: https://bugzilla.kernel.org/show_bug.cgi?id=198263 https://bugs.gentoo.org/642268 I bisected it to the commit below: $ git bisect good 2bc9fa0beaf10206a778f02e9e5cb62f50345b1a is the first bad commit commit 2bc9fa0beaf10206a778f02e9e5cb62f50345b1a Author: Andy Lutomirski <luto(a)kernel.org> Date: Mon Dec 4 15:07:23 2017 +0100 x86/entry/64: Use a per-CPU trampoline stack for IDT entries commit 7f2590a110b837af5679d08fc25c6227c5a8c497 upstream. Historically, IDT entries from usermode have always gone directly to the running task's kernel stack. Rearrange it so that we enter on a per-CPU trampoline stack and then manually switch to the task's stack. This touches a couple of extra cachelines, but it gives us a chance to run some code before we touch the kernel stack. The asm isn't exactly beautiful, but I think that fully refactoring it can wait. Signed-off-by: Andy Lutomirski <luto(a)kernel.org> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Reviewed-by: Borislav Petkov <bp(a)suse.de> Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Borislav Petkov <bpetkov(a)suse.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: Dave Hansen <dave.hansen(a)intel.com> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: David Laight <David.Laight(a)aculab.com> Cc: Denys Vlasenko <dvlasenk(a)redhat.com> Cc: Eduardo Valentin <eduval(a)amazon.com> Cc: Greg KH <gregkh(a)linuxfoundation.org> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: Rik van Riel <riel(a)redhat.com> Cc: Will Deacon <will.deacon(a)arm.com> Cc: aliguori(a)amazon.com Cc: daniel.gruss(a)iaik.tugraz.at Cc: hughd(a)google.com Cc: keescook(a)google.com Link: https://lkml.kernel.org/r/20171204150606.225330557@linutronix .de Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> :040000 040000 275d4746936a9e521a2b5041856f7dc1d1820dc6 8f8e869fd59c3dd781dceffa76e53e41d733a0cf M arch $ git bisect log git bisect start # bad: [dad5c1402c570cd07a80113784bc20a7f930c8ae] Linux 4.14.9 git bisect bad dad5c1402c570cd07a80113784bc20a7f930c8ae # good: [7b3775017f4e6b87dfd2c7f63d1eaf057948f31d] Linux 4.14.8 git bisect good 7b3775017f4e6b87dfd2c7f63d1eaf057948f31d # good: [d120cd749ef9770ee98b708a83b49547dcf1c0e1] x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0 git bisect good d120cd749ef9770ee98b708a83b49547dcf1c0e1 # bad: [97f41b41c432e5a80c91445d92c2f4b729984d36] powerpc/xmon: Avoid tripping SMP hardlockup watchdog git bisect bad 97f41b41c432e5a80c91445d92c2f4b729984d36 # bad: [bfd66a406fe7e590055c1d6714adc697f18664c8] PCI: Avoid bus reset if bridge itself is broken git bisect bad bfd66a406fe7e590055c1d6714adc697f18664c8 # bad: [8388d287e361a2fd0a39bece30a736d692d5c3d8] x86/cpufeatures: Make CPU bugs sticky git bisect bad 8388d287e361a2fd0a39bece30a736d692d5c3d8 # bad: [bb568391775d4a840992e2d2493f39d6e86401e3] x86/entry/64: Move the IST stacks into struct cpu_entry_area git bisect bad bb568391775d4a840992e2d2493f39d6e86401e3 # bad: [2bc9fa0beaf10206a778f02e9e5cb62f50345b1a] x86/entry/64: Use a per-CPU trampoline stack for IDT entries git bisect bad 2bc9fa0beaf10206a778f02e9e5cb62f50345b1a # good: [c3dbef1bd0f7eb09daf49409ea533aa1b0eeb82e] x86/espfix/64: Stop assuming that pt_regs is on the entry stack git bisect good c3dbef1bd0f7eb09daf49409ea533aa1b0eeb82e # first bad commit: [2bc9fa0beaf10206a778f02e9e5cb62f50345b1a] x86/entry/64: Use a per-CPU trampoline stack for IDT entries

7 years, 8 months

5
16
0 0

[Linux-stable-mirror] v4.9.73 build: 0 failures 0 warnings (v4.9.73)

by Build bot for Mark Brown

Tree/Branch: v4.9.73 Git describe: v4.9.73 Commit: b3e88217e2 Linux 4.9.73 Build Time: 101 min 5 sec Passed: 10 / 10 (100.00 %) Failed: 0 / 10 ( 0.00 %) Errors: 0 Warnings: 0 Section Mismatches: 0 ------------------------------------------------------------------------------- defconfigs with issues (other than build errors): ------------------------------------------------------------------------------- =============================================================================== Detailed per-defconfig build reports below: ------------------------------------------------------------------------------- Passed with no errors, warnings or mismatches: arm64-allnoconfig arm64-allmodconfig arm-multi_v5_defconfig arm-multi_v7_defconfig x86_64-defconfig arm-allmodconfig arm-allnoconfig x86_64-allnoconfig arm-multi_v4t_defconfig arm64-defconfig close failed in file object destructor: sys.excepthook is missing lost sys.stderr

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "tracing: Fix possible double free on failure of allocating trace buffer" has been added to the 4.9-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled tracing: Fix possible double free on failure of allocating trace buffer to the 4.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch and it can be found in the queue-4.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 4397f04575c44e1440ec2e49b6302785c95fd2f8 Mon Sep 17 00:00:00 2001 From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org> Date: Tue, 26 Dec 2017 20:07:34 -0500 Subject: tracing: Fix possible double free on failure of allocating trace buffer From: Steven Rostedt (VMware) <rostedt(a)goodmis.org> commit 4397f04575c44e1440ec2e49b6302785c95fd2f8 upstream. Jing Xia and Chunyan Zhang reported that on failing to allocate part of the tracing buffer, memory is freed, but the pointers that point to them are not initialized back to NULL, and later paths may try to free the freed memory again. Jing and Chunyan fixed one of the locations that does this, but missed a spot. Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code") Reported-by: Jing Xia <jing.xia(a)spreadtrum.com> Reported-by: Chunyan Zhang <chunyan.zhang(a)spreadtrum.com> Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- kernel/trace/trace.c | 1 + 1 file changed, 1 insertion(+) --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -6955,6 +6955,7 @@ allocate_trace_buffer(struct trace_array buf->data = alloc_percpu(struct trace_array_cpu); if (!buf->data) { ring_buffer_free(buf->buffer); + buf->buffer = NULL; return -ENOMEM; } Patches currently in stable-queue which might be from rostedt(a)goodmis.org are queue-4.9/tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch queue-4.9/tracing-remove-extra-zeroing-out-of-the-ring-buffer-page.patch queue-4.9/tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "tracing: Remove extra zeroing out of the ring buffer page" has been added to the 4.9-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled tracing: Remove extra zeroing out of the ring buffer page to the 4.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: tracing-remove-extra-zeroing-out-of-the-ring-buffer-page.patch and it can be found in the queue-4.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 6b7e633fe9c24682df550e5311f47fb524701586 Mon Sep 17 00:00:00 2001 From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org> Date: Fri, 22 Dec 2017 20:38:57 -0500 Subject: tracing: Remove extra zeroing out of the ring buffer page From: Steven Rostedt (VMware) <rostedt(a)goodmis.org> commit 6b7e633fe9c24682df550e5311f47fb524701586 upstream. The ring_buffer_read_page() takes care of zeroing out any extra data in the page that it returns. There's no need to zero it out again from the consumer. It was removed from one consumer of this function, but read_buffers_splice_read() did not remove it, and worse, it contained a nasty bug because of it. Fixes: 2711ca237a084 ("ring-buffer: Move zeroing out excess in page to ring buffer code") Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- kernel/trace/trace.c | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-) --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -6181,7 +6181,7 @@ tracing_buffers_splice_read(struct file .spd_release = buffer_spd_release, }; struct buffer_ref *ref; - int entries, size, i; + int entries, i; ssize_t ret = 0; #ifdef CONFIG_TRACER_MAX_TRACE @@ -6232,14 +6232,6 @@ tracing_buffers_splice_read(struct file break; } - /* - * zero out any left over data, this is going to - * user land. - */ - size = ring_buffer_page_len(ref->page); - if (size < PAGE_SIZE) - memset(ref->page + size, 0, PAGE_SIZE - size); - page = virt_to_page(ref->page); spd.pages[i] = page; Patches currently in stable-queue which might be from rostedt(a)goodmis.org are queue-4.9/tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch queue-4.9/tracing-remove-extra-zeroing-out-of-the-ring-buffer-page.patch queue-4.9/tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "tracing: Fix crash when it fails to alloc ring buffer" has been added to the 4.9-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled tracing: Fix crash when it fails to alloc ring buffer to the 4.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch and it can be found in the queue-4.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 24f2aaf952ee0b59f31c3a18b8b36c9e3d3c2cf5 Mon Sep 17 00:00:00 2001 From: Jing Xia <jing.xia(a)spreadtrum.com> Date: Tue, 26 Dec 2017 15:12:53 +0800 Subject: tracing: Fix crash when it fails to alloc ring buffer From: Jing Xia <jing.xia(a)spreadtrum.com> commit 24f2aaf952ee0b59f31c3a18b8b36c9e3d3c2cf5 upstream. Double free of the ring buffer happens when it fails to alloc new ring buffer instance for max_buffer if TRACER_MAX_TRACE is configured. The root cause is that the pointer is not set to NULL after the buffer is freed in allocate_trace_buffers(), and the freeing of the ring buffer is invoked again later if the pointer is not equal to Null, as: instance_mkdir() |-allocate_trace_buffers() |-allocate_trace_buffer(tr, &tr->trace_buffer...) |-allocate_trace_buffer(tr, &tr->max_buffer...) // allocate fail(-ENOMEM),first free // and the buffer pointer is not set to null |-ring_buffer_free(tr->trace_buffer.buffer) // out_free_tr |-free_trace_buffers() |-free_trace_buffer(&tr->trace_buffer); //if trace_buffer is not null, free again |-ring_buffer_free(buf->buffer) |-rb_free_cpu_buffer(buffer->buffers[cpu]) // ring_buffer_per_cpu is null, and // crash in ring_buffer_per_cpu->pages Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code") Signed-off-by: Jing Xia <jing.xia(a)spreadtrum.com> Signed-off-by: Chunyan Zhang <chunyan.zhang(a)spreadtrum.com> Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- kernel/trace/trace.c | 2 ++ 1 file changed, 2 insertions(+) --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -6979,7 +6979,9 @@ static int allocate_trace_buffers(struct allocate_snapshot ? size : 1); if (WARN_ON(ret)) { ring_buffer_free(tr->trace_buffer.buffer); + tr->trace_buffer.buffer = NULL; free_percpu(tr->trace_buffer.data); + tr->trace_buffer.data = NULL; return -ENOMEM; } tr->allocated_snapshot = allocate_snapshot; Patches currently in stable-queue which might be from jing.xia(a)spreadtrum.com are queue-4.9/tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch queue-4.9/tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "tracing: Fix possible double free on failure of allocating trace buffer" has been added to the 4.4-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled tracing: Fix possible double free on failure of allocating trace buffer to the 4.4-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch and it can be found in the queue-4.4 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 4397f04575c44e1440ec2e49b6302785c95fd2f8 Mon Sep 17 00:00:00 2001 From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org> Date: Tue, 26 Dec 2017 20:07:34 -0500 Subject: tracing: Fix possible double free on failure of allocating trace buffer From: Steven Rostedt (VMware) <rostedt(a)goodmis.org> commit 4397f04575c44e1440ec2e49b6302785c95fd2f8 upstream. Jing Xia and Chunyan Zhang reported that on failing to allocate part of the tracing buffer, memory is freed, but the pointers that point to them are not initialized back to NULL, and later paths may try to free the freed memory again. Jing and Chunyan fixed one of the locations that does this, but missed a spot. Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code") Reported-by: Jing Xia <jing.xia(a)spreadtrum.com> Reported-by: Chunyan Zhang <chunyan.zhang(a)spreadtrum.com> Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- kernel/trace/trace.c | 1 + 1 file changed, 1 insertion(+) --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -6531,6 +6531,7 @@ allocate_trace_buffer(struct trace_array buf->data = alloc_percpu(struct trace_array_cpu); if (!buf->data) { ring_buffer_free(buf->buffer); + buf->buffer = NULL; return -ENOMEM; } Patches currently in stable-queue which might be from rostedt(a)goodmis.org are queue-4.4/tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch queue-4.4/tracing-remove-extra-zeroing-out-of-the-ring-buffer-page.patch queue-4.4/tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "tracing: Remove extra zeroing out of the ring buffer page" has been added to the 4.4-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled tracing: Remove extra zeroing out of the ring buffer page to the 4.4-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: tracing-remove-extra-zeroing-out-of-the-ring-buffer-page.patch and it can be found in the queue-4.4 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 6b7e633fe9c24682df550e5311f47fb524701586 Mon Sep 17 00:00:00 2001 From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org> Date: Fri, 22 Dec 2017 20:38:57 -0500 Subject: tracing: Remove extra zeroing out of the ring buffer page From: Steven Rostedt (VMware) <rostedt(a)goodmis.org> commit 6b7e633fe9c24682df550e5311f47fb524701586 upstream. The ring_buffer_read_page() takes care of zeroing out any extra data in the page that it returns. There's no need to zero it out again from the consumer. It was removed from one consumer of this function, but read_buffers_splice_read() did not remove it, and worse, it contained a nasty bug because of it. Fixes: 2711ca237a084 ("ring-buffer: Move zeroing out excess in page to ring buffer code") Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- kernel/trace/trace.c | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-) --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -5754,7 +5754,7 @@ tracing_buffers_splice_read(struct file .spd_release = buffer_spd_release, }; struct buffer_ref *ref; - int entries, size, i; + int entries, i; ssize_t ret = 0; #ifdef CONFIG_TRACER_MAX_TRACE @@ -5805,14 +5805,6 @@ tracing_buffers_splice_read(struct file break; } - /* - * zero out any left over data, this is going to - * user land. - */ - size = ring_buffer_page_len(ref->page); - if (size < PAGE_SIZE) - memset(ref->page + size, 0, PAGE_SIZE - size); - page = virt_to_page(ref->page); spd.pages[i] = page; Patches currently in stable-queue which might be from rostedt(a)goodmis.org are queue-4.4/tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch queue-4.4/tracing-remove-extra-zeroing-out-of-the-ring-buffer-page.patch queue-4.4/tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "tracing: Fix crash when it fails to alloc ring buffer" has been added to the 4.4-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled tracing: Fix crash when it fails to alloc ring buffer to the 4.4-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch and it can be found in the queue-4.4 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 24f2aaf952ee0b59f31c3a18b8b36c9e3d3c2cf5 Mon Sep 17 00:00:00 2001 From: Jing Xia <jing.xia(a)spreadtrum.com> Date: Tue, 26 Dec 2017 15:12:53 +0800 Subject: tracing: Fix crash when it fails to alloc ring buffer From: Jing Xia <jing.xia(a)spreadtrum.com> commit 24f2aaf952ee0b59f31c3a18b8b36c9e3d3c2cf5 upstream. Double free of the ring buffer happens when it fails to alloc new ring buffer instance for max_buffer if TRACER_MAX_TRACE is configured. The root cause is that the pointer is not set to NULL after the buffer is freed in allocate_trace_buffers(), and the freeing of the ring buffer is invoked again later if the pointer is not equal to Null, as: instance_mkdir() |-allocate_trace_buffers() |-allocate_trace_buffer(tr, &tr->trace_buffer...) |-allocate_trace_buffer(tr, &tr->max_buffer...) // allocate fail(-ENOMEM),first free // and the buffer pointer is not set to null |-ring_buffer_free(tr->trace_buffer.buffer) // out_free_tr |-free_trace_buffers() |-free_trace_buffer(&tr->trace_buffer); //if trace_buffer is not null, free again |-ring_buffer_free(buf->buffer) |-rb_free_cpu_buffer(buffer->buffers[cpu]) // ring_buffer_per_cpu is null, and // crash in ring_buffer_per_cpu->pages Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code") Signed-off-by: Jing Xia <jing.xia(a)spreadtrum.com> Signed-off-by: Chunyan Zhang <chunyan.zhang(a)spreadtrum.com> Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- kernel/trace/trace.c | 2 ++ 1 file changed, 2 insertions(+) --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -6555,7 +6555,9 @@ static int allocate_trace_buffers(struct allocate_snapshot ? size : 1); if (WARN_ON(ret)) { ring_buffer_free(tr->trace_buffer.buffer); + tr->trace_buffer.buffer = NULL; free_percpu(tr->trace_buffer.data); + tr->trace_buffer.data = NULL; return -ENOMEM; } tr->allocated_snapshot = allocate_snapshot; Patches currently in stable-queue which might be from jing.xia(a)spreadtrum.com are queue-4.4/tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch queue-4.4/tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "tracing: Remove extra zeroing out of the ring buffer page" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled tracing: Remove extra zeroing out of the ring buffer page to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: tracing-remove-extra-zeroing-out-of-the-ring-buffer-page.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 6b7e633fe9c24682df550e5311f47fb524701586 Mon Sep 17 00:00:00 2001 From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org> Date: Fri, 22 Dec 2017 20:38:57 -0500 Subject: tracing: Remove extra zeroing out of the ring buffer page From: Steven Rostedt (VMware) <rostedt(a)goodmis.org> commit 6b7e633fe9c24682df550e5311f47fb524701586 upstream. The ring_buffer_read_page() takes care of zeroing out any extra data in the page that it returns. There's no need to zero it out again from the consumer. It was removed from one consumer of this function, but read_buffers_splice_read() did not remove it, and worse, it contained a nasty bug because of it. Fixes: 2711ca237a084 ("ring-buffer: Move zeroing out excess in page to ring buffer code") Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- kernel/trace/trace.c | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-) --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -6769,7 +6769,7 @@ tracing_buffers_splice_read(struct file .spd_release = buffer_spd_release, }; struct buffer_ref *ref; - int entries, size, i; + int entries, i; ssize_t ret = 0; #ifdef CONFIG_TRACER_MAX_TRACE @@ -6823,14 +6823,6 @@ tracing_buffers_splice_read(struct file break; } - /* - * zero out any left over data, this is going to - * user land. - */ - size = ring_buffer_page_len(ref->page); - if (size < PAGE_SIZE) - memset(ref->page + size, 0, PAGE_SIZE - size); - page = virt_to_page(ref->page); spd.pages[i] = page; Patches currently in stable-queue which might be from rostedt(a)goodmis.org are queue-4.14/tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch queue-4.14/tracing-remove-extra-zeroing-out-of-the-ring-buffer-page.patch queue-4.14/tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "tracing: Fix possible double free on failure of allocating trace buffer" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled tracing: Fix possible double free on failure of allocating trace buffer to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 4397f04575c44e1440ec2e49b6302785c95fd2f8 Mon Sep 17 00:00:00 2001 From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org> Date: Tue, 26 Dec 2017 20:07:34 -0500 Subject: tracing: Fix possible double free on failure of allocating trace buffer From: Steven Rostedt (VMware) <rostedt(a)goodmis.org> commit 4397f04575c44e1440ec2e49b6302785c95fd2f8 upstream. Jing Xia and Chunyan Zhang reported that on failing to allocate part of the tracing buffer, memory is freed, but the pointers that point to them are not initialized back to NULL, and later paths may try to free the freed memory again. Jing and Chunyan fixed one of the locations that does this, but missed a spot. Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code") Reported-by: Jing Xia <jing.xia(a)spreadtrum.com> Reported-by: Chunyan Zhang <chunyan.zhang(a)spreadtrum.com> Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- kernel/trace/trace.c | 1 + 1 file changed, 1 insertion(+) --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -7580,6 +7580,7 @@ allocate_trace_buffer(struct trace_array buf->data = alloc_percpu(struct trace_array_cpu); if (!buf->data) { ring_buffer_free(buf->buffer); + buf->buffer = NULL; return -ENOMEM; } Patches currently in stable-queue which might be from rostedt(a)goodmis.org are queue-4.14/tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch queue-4.14/tracing-remove-extra-zeroing-out-of-the-ring-buffer-page.patch queue-4.14/tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "tracing: Fix crash when it fails to alloc ring buffer" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled tracing: Fix crash when it fails to alloc ring buffer to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 24f2aaf952ee0b59f31c3a18b8b36c9e3d3c2cf5 Mon Sep 17 00:00:00 2001 From: Jing Xia <jing.xia(a)spreadtrum.com> Date: Tue, 26 Dec 2017 15:12:53 +0800 Subject: tracing: Fix crash when it fails to alloc ring buffer From: Jing Xia <jing.xia(a)spreadtrum.com> commit 24f2aaf952ee0b59f31c3a18b8b36c9e3d3c2cf5 upstream. Double free of the ring buffer happens when it fails to alloc new ring buffer instance for max_buffer if TRACER_MAX_TRACE is configured. The root cause is that the pointer is not set to NULL after the buffer is freed in allocate_trace_buffers(), and the freeing of the ring buffer is invoked again later if the pointer is not equal to Null, as: instance_mkdir() |-allocate_trace_buffers() |-allocate_trace_buffer(tr, &tr->trace_buffer...) |-allocate_trace_buffer(tr, &tr->max_buffer...) // allocate fail(-ENOMEM),first free // and the buffer pointer is not set to null |-ring_buffer_free(tr->trace_buffer.buffer) // out_free_tr |-free_trace_buffers() |-free_trace_buffer(&tr->trace_buffer); //if trace_buffer is not null, free again |-ring_buffer_free(buf->buffer) |-rb_free_cpu_buffer(buffer->buffers[cpu]) // ring_buffer_per_cpu is null, and // crash in ring_buffer_per_cpu->pages Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code") Signed-off-by: Jing Xia <jing.xia(a)spreadtrum.com> Signed-off-by: Chunyan Zhang <chunyan.zhang(a)spreadtrum.com> Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- kernel/trace/trace.c | 2 ++ 1 file changed, 2 insertions(+) --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -7604,7 +7604,9 @@ static int allocate_trace_buffers(struct allocate_snapshot ? size : 1); if (WARN_ON(ret)) { ring_buffer_free(tr->trace_buffer.buffer); + tr->trace_buffer.buffer = NULL; free_percpu(tr->trace_buffer.data); + tr->trace_buffer.data = NULL; return -ENOMEM; } tr->allocated_snapshot = allocate_snapshot; Patches currently in stable-queue which might be from jing.xia(a)spreadtrum.com are queue-4.14/tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch queue-4.14/tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "tracing: Fix possible double free on failure of allocating trace buffer" has been added to the 3.18-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled tracing: Fix possible double free on failure of allocating trace buffer to the 3.18-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch and it can be found in the queue-3.18 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 4397f04575c44e1440ec2e49b6302785c95fd2f8 Mon Sep 17 00:00:00 2001 From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org> Date: Tue, 26 Dec 2017 20:07:34 -0500 Subject: tracing: Fix possible double free on failure of allocating trace buffer From: Steven Rostedt (VMware) <rostedt(a)goodmis.org> commit 4397f04575c44e1440ec2e49b6302785c95fd2f8 upstream. Jing Xia and Chunyan Zhang reported that on failing to allocate part of the tracing buffer, memory is freed, but the pointers that point to them are not initialized back to NULL, and later paths may try to free the freed memory again. Jing and Chunyan fixed one of the locations that does this, but missed a spot. Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code") Reported-by: Jing Xia <jing.xia(a)spreadtrum.com> Reported-by: Chunyan Zhang <chunyan.zhang(a)spreadtrum.com> Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- kernel/trace/trace.c | 1 + 1 file changed, 1 insertion(+) --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -6244,6 +6244,7 @@ allocate_trace_buffer(struct trace_array buf->data = alloc_percpu(struct trace_array_cpu); if (!buf->data) { ring_buffer_free(buf->buffer); + buf->buffer = NULL; return -ENOMEM; } Patches currently in stable-queue which might be from rostedt(a)goodmis.org are queue-3.18/tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch queue-3.18/tracing-remove-extra-zeroing-out-of-the-ring-buffer-page.patch queue-3.18/tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "tracing: Remove extra zeroing out of the ring buffer page" has been added to the 3.18-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled tracing: Remove extra zeroing out of the ring buffer page to the 3.18-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: tracing-remove-extra-zeroing-out-of-the-ring-buffer-page.patch and it can be found in the queue-3.18 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 6b7e633fe9c24682df550e5311f47fb524701586 Mon Sep 17 00:00:00 2001 From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org> Date: Fri, 22 Dec 2017 20:38:57 -0500 Subject: tracing: Remove extra zeroing out of the ring buffer page From: Steven Rostedt (VMware) <rostedt(a)goodmis.org> commit 6b7e633fe9c24682df550e5311f47fb524701586 upstream. The ring_buffer_read_page() takes care of zeroing out any extra data in the page that it returns. There's no need to zero it out again from the consumer. It was removed from one consumer of this function, but read_buffers_splice_read() did not remove it, and worse, it contained a nasty bug because of it. Fixes: 2711ca237a084 ("ring-buffer: Move zeroing out excess in page to ring buffer code") Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- kernel/trace/trace.c | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-) --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -5502,7 +5502,7 @@ tracing_buffers_splice_read(struct file .spd_release = buffer_spd_release, }; struct buffer_ref *ref; - int entries, size, i; + int entries, i; ssize_t ret = 0; mutex_lock(&trace_types_lock); @@ -5563,14 +5563,6 @@ tracing_buffers_splice_read(struct file break; } - /* - * zero out any left over data, this is going to - * user land. - */ - size = ring_buffer_page_len(ref->page); - if (size < PAGE_SIZE) - memset(ref->page + size, 0, PAGE_SIZE - size); - page = virt_to_page(ref->page); spd.pages[i] = page; Patches currently in stable-queue which might be from rostedt(a)goodmis.org are queue-3.18/tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch queue-3.18/tracing-remove-extra-zeroing-out-of-the-ring-buffer-page.patch queue-3.18/tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "tracing: Fix crash when it fails to alloc ring buffer" has been added to the 3.18-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled tracing: Fix crash when it fails to alloc ring buffer to the 3.18-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch and it can be found in the queue-3.18 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 24f2aaf952ee0b59f31c3a18b8b36c9e3d3c2cf5 Mon Sep 17 00:00:00 2001 From: Jing Xia <jing.xia(a)spreadtrum.com> Date: Tue, 26 Dec 2017 15:12:53 +0800 Subject: tracing: Fix crash when it fails to alloc ring buffer From: Jing Xia <jing.xia(a)spreadtrum.com> commit 24f2aaf952ee0b59f31c3a18b8b36c9e3d3c2cf5 upstream. Double free of the ring buffer happens when it fails to alloc new ring buffer instance for max_buffer if TRACER_MAX_TRACE is configured. The root cause is that the pointer is not set to NULL after the buffer is freed in allocate_trace_buffers(), and the freeing of the ring buffer is invoked again later if the pointer is not equal to Null, as: instance_mkdir() |-allocate_trace_buffers() |-allocate_trace_buffer(tr, &tr->trace_buffer...) |-allocate_trace_buffer(tr, &tr->max_buffer...) // allocate fail(-ENOMEM),first free // and the buffer pointer is not set to null |-ring_buffer_free(tr->trace_buffer.buffer) // out_free_tr |-free_trace_buffers() |-free_trace_buffer(&tr->trace_buffer); //if trace_buffer is not null, free again |-ring_buffer_free(buf->buffer) |-rb_free_cpu_buffer(buffer->buffers[cpu]) // ring_buffer_per_cpu is null, and // crash in ring_buffer_per_cpu->pages Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code") Signed-off-by: Jing Xia <jing.xia(a)spreadtrum.com> Signed-off-by: Chunyan Zhang <chunyan.zhang(a)spreadtrum.com> Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- kernel/trace/trace.c | 2 ++ 1 file changed, 2 insertions(+) --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -6268,7 +6268,9 @@ static int allocate_trace_buffers(struct allocate_snapshot ? size : 1); if (WARN_ON(ret)) { ring_buffer_free(tr->trace_buffer.buffer); + tr->trace_buffer.buffer = NULL; free_percpu(tr->trace_buffer.data); + tr->trace_buffer.data = NULL; return -ENOMEM; } tr->allocated_snapshot = allocate_snapshot; Patches currently in stable-queue which might be from jing.xia(a)spreadtrum.com are queue-3.18/tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch queue-3.18/tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] 4.14.10-rc : make[2]: execvp: ./sync-check.sh: Permission denied

by Toralf Förster

I do get: CC /usr/src/linux-4.x/tools/objtool/libstring.o HOSTLLD -shared scripts/gcc-plugins/latent_entropy_plugin.so HOSTCXX -fPIC scripts/gcc-plugins/randomize_layout_plugin.o CC /usr/src/linux-4.x/tools/objtool/str_error_r.o LD /usr/src/linux-4.x/tools/objtool/objtool-in.o HOSTLLD -shared scripts/gcc-plugins/structleak_plugin.so make[2]: execvp: ./sync-check.sh: Permission denied make[2]: *** [Makefile:49: /usr/src/linux-4.x/tools/objtool/objtool] Error 127 make[1]: *** [Makefile:62: objtool] Error 2 make: *** [Makefile:1629: tools/objtool] Error 2 make: *** Waiting for unfinished jobs.... HOSTLLD -shared scripts/gcc-plugins/randomize_layout_plugin.so t -- Toralf PGP C4EACDDE 0076E94E

7 years, 8 months

2
10
0 0

[Linux-stable-mirror] [PATCH] video: fbdev: atmel_lcdfb: fix display-timings lookup

by Johan Hovold

Fix child-node lookup during probe, which ended up searching the whole device tree depth-first starting at the parent rather than just matching on its children. To make things worse, the parent display node was also prematurely freed. Note that the display and timings node references are never put after a successful dt-initialisation so the nodes would leak on later probe deferrals and on driver unbind. Fixes: b985172b328a ("video: atmel_lcdfb: add device tree suport") Cc: stable <stable(a)vger.kernel.org> # 3.13 Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj(a)jcrosoft.com> Signed-off-by: Johan Hovold <johan(a)kernel.org> --- drivers/video/fbdev/atmel_lcdfb.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/video/fbdev/atmel_lcdfb.c b/drivers/video/fbdev/atmel_lcdfb.c index e06358da4b99..3dee267d7c75 100644 --- a/drivers/video/fbdev/atmel_lcdfb.c +++ b/drivers/video/fbdev/atmel_lcdfb.c @@ -1119,7 +1119,7 @@ static int atmel_lcdfb_of_init(struct atmel_lcdfb_info *sinfo) goto put_display_node; } - timings_np = of_find_node_by_name(display_np, "display-timings"); + timings_np = of_get_child_by_name(display_np, "display-timings"); if (!timings_np) { dev_err(dev, "failed to find display-timings node\n"); ret = -ENODEV; @@ -1140,6 +1140,12 @@ static int atmel_lcdfb_of_init(struct atmel_lcdfb_info *sinfo) fb_add_videomode(&fb_vm, &info->modelist); } + /* + * FIXME: Make sure we are not referencing any fields in display_np + * and timings_np and drop our references to them before returning to + * avoid leaking the nodes on probe deferral and driver unbind. + */ + return 0; put_timings_node: -- 2.15.0

7 years, 8 months

2
1
0 0

Re: [Linux-stable-mirror] Linux 4.14.10

by Greg KH

diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt index 3448e675b462..51101708a03a 100644 --- a/Documentation/x86/x86_64/mm.txt +++ b/Documentation/x86/x86_64/mm.txt @@ -1,6 +1,4 @@ -<previous description obsolete, deleted> - Virtual memory map with 4 level page tables: 0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm @@ -14,13 +12,15 @@ ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB) ... unused hole ... ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB) ... unused hole ... +fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks ... unused hole ... ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space ... unused hole ... ffffffff80000000 - ffffffff9fffffff (=512 MB) kernel text mapping, from phys 0 -ffffffffa0000000 - ffffffffff5fffff (=1526 MB) module mapping space (variable) -ffffffffff600000 - ffffffffffdfffff (=8 MB) vsyscalls +ffffffffa0000000 - [fixmap start] (~1526 MB) module mapping space (variable) +[fixmap start] - ffffffffff5fffff kernel-internal fixmap range +ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole Virtual memory map with 5 level page tables: @@ -36,19 +36,22 @@ ffd4000000000000 - ffd5ffffffffffff (=49 bits) virtual memory map (512TB) ... unused hole ... ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB) ... unused hole ... +fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks ... unused hole ... ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space ... unused hole ... ffffffff80000000 - ffffffff9fffffff (=512 MB) kernel text mapping, from phys 0 -ffffffffa0000000 - ffffffffff5fffff (=1526 MB) module mapping space -ffffffffff600000 - ffffffffffdfffff (=8 MB) vsyscalls +ffffffffa0000000 - [fixmap start] (~1526 MB) module mapping space +[fixmap start] - ffffffffff5fffff kernel-internal fixmap range +ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole Architecture defines a 64-bit virtual address. Implementations can support less. Currently supported are 48- and 57-bit virtual addresses. Bits 63 -through to the most-significant implemented bit are set to either all ones -or all zero. This causes hole between user space and kernel addresses. +through to the most-significant implemented bit are sign extended. +This causes hole between user space and kernel addresses if you interpret them +as unsigned. The direct mapping covers all memory in the system up to the highest memory address (this means in some cases it can also include PCI memory @@ -58,9 +61,6 @@ vmalloc space is lazily synchronized into the different PML4/PML5 pages of the processes using the page fault handler, with init_top_pgt as reference. -Current X86-64 implementations support up to 46 bits of address space (64 TB), -which is our current limit. This expands into MBZ space in the page tables. - We map EFI runtime services in the 'efi_pgd' PGD in a 64Gb large virtual memory window (this size is arbitrary, it can be raised later if needed). The mappings are not part of any other kernel PGD and are only available @@ -72,5 +72,3 @@ following fixmap section. Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all physical memory, vmalloc/ioremap space and virtual memory map are randomized. Their order is preserved but their base will be offset early at boot time. - --Andi Kleen, Jul 2004 diff --git a/Makefile b/Makefile index ed2132c6d286..9edfb78836a9 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 VERSION = 4 PATCHLEVEL = 14 -SUBLEVEL = 9 +SUBLEVEL = 10 EXTRAVERSION = NAME = Petit Gorille diff --git a/arch/arm64/kvm/hyp/debug-sr.c b/arch/arm64/kvm/hyp/debug-sr.c index f5154ed3da6c..2add22699764 100644 --- a/arch/arm64/kvm/hyp/debug-sr.c +++ b/arch/arm64/kvm/hyp/debug-sr.c @@ -84,6 +84,9 @@ static void __hyp_text __debug_save_spe_nvhe(u64 *pmscr_el1) { u64 reg; + /* Clear pmscr in case of early return */ + *pmscr_el1 = 0; + /* SPE present on this CPU? */ if (!cpuid_feature_extract_unsigned_field(read_sysreg(id_aa64dfr0_el1), ID_AA64DFR0_PMSVER_SHIFT)) diff --git a/arch/parisc/boot/compressed/misc.c b/arch/parisc/boot/compressed/misc.c index 9345b44b86f0..f57118e1f6b4 100644 --- a/arch/parisc/boot/compressed/misc.c +++ b/arch/parisc/boot/compressed/misc.c @@ -123,8 +123,8 @@ int puts(const char *s) while ((nuline = strchr(s, '\n')) != NULL) { if (nuline != s) pdc_iodc_print(s, nuline - s); - pdc_iodc_print("\r\n", 2); - s = nuline + 1; + pdc_iodc_print("\r\n", 2); + s = nuline + 1; } if (*s != '\0') pdc_iodc_print(s, strlen(s)); diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S index a4fd296c958e..f3cecf5117cf 100644 --- a/arch/parisc/kernel/entry.S +++ b/arch/parisc/kernel/entry.S @@ -878,9 +878,6 @@ ENTRY_CFI(syscall_exit_rfi) STREG %r19,PT_SR7(%r16) intr_return: - /* NOTE: Need to enable interrupts incase we schedule. */ - ssm PSW_SM_I, %r0 - /* check for reschedule */ mfctl %cr30,%r1 LDREG TI_FLAGS(%r1),%r19 /* sched.h: TIF_NEED_RESCHED */ @@ -907,6 +904,11 @@ intr_check_sig: LDREG PT_IASQ1(%r16), %r20 cmpib,COND(=),n 0,%r20,intr_restore /* backward */ + /* NOTE: We need to enable interrupts if we have to deliver + * signals. We used to do this earlier but it caused kernel + * stack overflows. */ + ssm PSW_SM_I, %r0 + copy %r0, %r25 /* long in_syscall = 0 */ #ifdef CONFIG_64BIT ldo -16(%r30),%r29 /* Reference param save area */ @@ -958,6 +960,10 @@ intr_do_resched: cmpib,COND(=) 0, %r20, intr_do_preempt nop + /* NOTE: We need to enable interrupts if we schedule. We used + * to do this earlier but it caused kernel stack overflows. */ + ssm PSW_SM_I, %r0 + #ifdef CONFIG_64BIT ldo -16(%r30),%r29 /* Reference param save area */ #endif diff --git a/arch/parisc/kernel/hpmc.S b/arch/parisc/kernel/hpmc.S index e3a8e5e4d5de..8d072c44f300 100644 --- a/arch/parisc/kernel/hpmc.S +++ b/arch/parisc/kernel/hpmc.S @@ -305,6 +305,7 @@ ENDPROC_CFI(os_hpmc) __INITRODATA + .align 4 .export os_hpmc_size os_hpmc_size: .word .os_hpmc_end-.os_hpmc diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h index 492d8140a395..44fdf4786638 100644 --- a/arch/powerpc/include/asm/mmu_context.h +++ b/arch/powerpc/include/asm/mmu_context.h @@ -114,9 +114,10 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, #endif } -static inline void arch_dup_mmap(struct mm_struct *oldmm, - struct mm_struct *mm) +static inline int arch_dup_mmap(struct mm_struct *oldmm, + struct mm_struct *mm) { + return 0; } static inline void arch_exit_mmap(struct mm_struct *mm) diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c index bf457843e032..0d750d274c4e 100644 --- a/arch/powerpc/kvm/book3s_xive.c +++ b/arch/powerpc/kvm/book3s_xive.c @@ -725,7 +725,8 @@ u64 kvmppc_xive_get_icp(struct kvm_vcpu *vcpu) /* Return the per-cpu state for state saving/migration */ return (u64)xc->cppr << KVM_REG_PPC_ICP_CPPR_SHIFT | - (u64)xc->mfrr << KVM_REG_PPC_ICP_MFRR_SHIFT; + (u64)xc->mfrr << KVM_REG_PPC_ICP_MFRR_SHIFT | + (u64)0xff << KVM_REG_PPC_ICP_PPRI_SHIFT; } int kvmppc_xive_set_icp(struct kvm_vcpu *vcpu, u64 icpval) @@ -1558,7 +1559,7 @@ static int xive_set_source(struct kvmppc_xive *xive, long irq, u64 addr) /* * Restore P and Q. If the interrupt was pending, we - * force both P and Q, which will trigger a resend. + * force Q and !P, which will trigger a resend. * * That means that a guest that had both an interrupt * pending (queued) and Q set will restore with only @@ -1566,7 +1567,7 @@ static int xive_set_source(struct kvmppc_xive *xive, long irq, u64 addr) * is perfectly fine as coalescing interrupts that haven't * been presented yet is always allowed. */ - if (val & KVM_XICS_PRESENTED || val & KVM_XICS_PENDING) + if (val & KVM_XICS_PRESENTED && !(val & KVM_XICS_PENDING)) state->old_p = true; if (val & KVM_XICS_QUEUED || val & KVM_XICS_PENDING) state->old_q = true; diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c index 9e3da168d54c..b4209a68b85d 100644 --- a/arch/powerpc/perf/core-book3s.c +++ b/arch/powerpc/perf/core-book3s.c @@ -410,8 +410,12 @@ static __u64 power_pmu_bhrb_to(u64 addr) int ret; __u64 target; - if (is_kernel_addr(addr)) - return branch_target((unsigned int *)addr); + if (is_kernel_addr(addr)) { + if (probe_kernel_read(&instr, (void *)addr, sizeof(instr))) + return 0; + + return branch_target(&instr); + } /* Userspace: need copy instruction here then translate it */ pagefault_disable(); diff --git a/arch/um/include/asm/mmu_context.h b/arch/um/include/asm/mmu_context.h index b668e351fd6c..fca34b2177e2 100644 --- a/arch/um/include/asm/mmu_context.h +++ b/arch/um/include/asm/mmu_context.h @@ -15,9 +15,10 @@ extern void uml_setup_stubs(struct mm_struct *mm); /* * Needed since we do not use the asm-generic/mm_hooks.h: */ -static inline void arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm) +static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm) { uml_setup_stubs(mm); + return 0; } extern void arch_exit_mmap(struct mm_struct *mm); static inline void arch_unmap(struct mm_struct *mm, diff --git a/arch/unicore32/include/asm/mmu_context.h b/arch/unicore32/include/asm/mmu_context.h index 59b06b48f27d..5c205a9cb5a6 100644 --- a/arch/unicore32/include/asm/mmu_context.h +++ b/arch/unicore32/include/asm/mmu_context.h @@ -81,9 +81,10 @@ do { \ } \ } while (0) -static inline void arch_dup_mmap(struct mm_struct *oldmm, - struct mm_struct *mm) +static inline int arch_dup_mmap(struct mm_struct *oldmm, + struct mm_struct *mm) { + return 0; } static inline void arch_unmap(struct mm_struct *mm, diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 48646160eb83..592c974d4558 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -925,7 +925,8 @@ config MAXSMP config NR_CPUS int "Maximum number of CPUs" if SMP && !MAXSMP range 2 8 if SMP && X86_32 && !X86_BIGSMP - range 2 512 if SMP && !MAXSMP && !CPUMASK_OFFSTACK + range 2 64 if SMP && X86_32 && X86_BIGSMP + range 2 512 if SMP && !MAXSMP && !CPUMASK_OFFSTACK && X86_64 range 2 8192 if SMP && !MAXSMP && CPUMASK_OFFSTACK && X86_64 default "1" if !SMP default "8192" if MAXSMP diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S index bd8b57a5c874..ace8f321a5a1 100644 --- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -942,9 +942,9 @@ ENTRY(debug) /* Are we currently on the SYSENTER stack? */ movl PER_CPU_VAR(cpu_entry_area), %ecx - addl $CPU_ENTRY_AREA_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx - subl %eax, %ecx /* ecx = (end of SYSENTER_stack) - esp */ - cmpl $SIZEOF_SYSENTER_stack, %ecx + addl $CPU_ENTRY_AREA_entry_stack + SIZEOF_entry_stack, %ecx + subl %eax, %ecx /* ecx = (end of entry_stack) - esp */ + cmpl $SIZEOF_entry_stack, %ecx jb .Ldebug_from_sysenter_stack TRACE_IRQS_OFF @@ -986,9 +986,9 @@ ENTRY(nmi) /* Are we currently on the SYSENTER stack? */ movl PER_CPU_VAR(cpu_entry_area), %ecx - addl $CPU_ENTRY_AREA_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx - subl %eax, %ecx /* ecx = (end of SYSENTER_stack) - esp */ - cmpl $SIZEOF_SYSENTER_stack, %ecx + addl $CPU_ENTRY_AREA_entry_stack + SIZEOF_entry_stack, %ecx + subl %eax, %ecx /* ecx = (end of entry_stack) - esp */ + cmpl $SIZEOF_entry_stack, %ecx jb .Lnmi_from_sysenter_stack /* Not on SYSENTER stack. */ diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 6abe3fcaece9..22c891c3b78d 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -154,8 +154,8 @@ END(native_usergs_sysret64) _entry_trampoline - CPU_ENTRY_AREA_entry_trampoline(%rip) /* The top word of the SYSENTER stack is hot and is usable as scratch space. */ -#define RSP_SCRATCH CPU_ENTRY_AREA_SYSENTER_stack + \ - SIZEOF_SYSENTER_stack - 8 + CPU_ENTRY_AREA +#define RSP_SCRATCH CPU_ENTRY_AREA_entry_stack + \ + SIZEOF_entry_stack - 8 + CPU_ENTRY_AREA ENTRY(entry_SYSCALL_64_trampoline) UNWIND_HINT_EMPTY diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c index f279ba2643dc..1faf40f2dda9 100644 --- a/arch/x86/entry/vsyscall/vsyscall_64.c +++ b/arch/x86/entry/vsyscall/vsyscall_64.c @@ -37,6 +37,7 @@ #include <asm/unistd.h> #include <asm/fixmap.h> #include <asm/traps.h> +#include <asm/paravirt.h> #define CREATE_TRACE_POINTS #include "vsyscall_trace.h" @@ -138,6 +139,10 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address) WARN_ON_ONCE(address != regs->ip); + /* This should be unreachable in NATIVE mode. */ + if (WARN_ON(vsyscall_mode == NATIVE)) + return false; + if (vsyscall_mode == NONE) { warn_bad_vsyscall(KERN_INFO, regs, "vsyscall attempted with vsyscall=none"); @@ -329,16 +334,47 @@ int in_gate_area_no_mm(unsigned long addr) return vsyscall_mode != NONE && (addr & PAGE_MASK) == VSYSCALL_ADDR; } +/* + * The VSYSCALL page is the only user-accessible page in the kernel address + * range. Normally, the kernel page tables can have _PAGE_USER clear, but + * the tables covering VSYSCALL_ADDR need _PAGE_USER set if vsyscalls + * are enabled. + * + * Some day we may create a "minimal" vsyscall mode in which we emulate + * vsyscalls but leave the page not present. If so, we skip calling + * this. + */ +static void __init set_vsyscall_pgtable_user_bits(void) +{ + pgd_t *pgd; + p4d_t *p4d; + pud_t *pud; + pmd_t *pmd; + + pgd = pgd_offset_k(VSYSCALL_ADDR); + set_pgd(pgd, __pgd(pgd_val(*pgd) | _PAGE_USER)); + p4d = p4d_offset(pgd, VSYSCALL_ADDR); +#if CONFIG_PGTABLE_LEVELS >= 5 + p4d->p4d |= _PAGE_USER; +#endif + pud = pud_offset(p4d, VSYSCALL_ADDR); + set_pud(pud, __pud(pud_val(*pud) | _PAGE_USER)); + pmd = pmd_offset(pud, VSYSCALL_ADDR); + set_pmd(pmd, __pmd(pmd_val(*pmd) | _PAGE_USER)); +} + void __init map_vsyscall(void) { extern char __vsyscall_page; unsigned long physaddr_vsyscall = __pa_symbol(&__vsyscall_page); - if (vsyscall_mode != NONE) + if (vsyscall_mode != NONE) { __set_fixmap(VSYSCALL_PAGE, physaddr_vsyscall, vsyscall_mode == NATIVE ? PAGE_KERNEL_VSYSCALL : PAGE_KERNEL_VVAR); + set_vsyscall_pgtable_user_bits(); + } BUILD_BUG_ON((unsigned long)__fix_to_virt(VSYSCALL_PAGE) != (unsigned long)VSYSCALL_ADDR); diff --git a/arch/x86/include/asm/cpu_entry_area.h b/arch/x86/include/asm/cpu_entry_area.h new file mode 100644 index 000000000000..2fbc69a0916e --- /dev/null +++ b/arch/x86/include/asm/cpu_entry_area.h @@ -0,0 +1,68 @@ +// SPDX-License-Identifier: GPL-2.0 + +#ifndef _ASM_X86_CPU_ENTRY_AREA_H +#define _ASM_X86_CPU_ENTRY_AREA_H + +#include <linux/percpu-defs.h> +#include <asm/processor.h> + +/* + * cpu_entry_area is a percpu region that contains things needed by the CPU + * and early entry/exit code. Real types aren't used for all fields here + * to avoid circular header dependencies. + * + * Every field is a virtual alias of some other allocated backing store. + * There is no direct allocation of a struct cpu_entry_area. + */ +struct cpu_entry_area { + char gdt[PAGE_SIZE]; + + /* + * The GDT is just below entry_stack and thus serves (on x86_64) as + * a a read-only guard page. + */ + struct entry_stack_page entry_stack_page; + + /* + * On x86_64, the TSS is mapped RO. On x86_32, it's mapped RW because + * we need task switches to work, and task switches write to the TSS. + */ + struct tss_struct tss; + + char entry_trampoline[PAGE_SIZE]; + +#ifdef CONFIG_X86_64 + /* + * Exception stacks used for IST entries. + * + * In the future, this should have a separate slot for each stack + * with guard pages between them. + */ + char exception_stacks[(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ]; +#endif +}; + +#define CPU_ENTRY_AREA_SIZE (sizeof(struct cpu_entry_area)) +#define CPU_ENTRY_AREA_TOT_SIZE (CPU_ENTRY_AREA_SIZE * NR_CPUS) + +DECLARE_PER_CPU(struct cpu_entry_area *, cpu_entry_area); + +extern void setup_cpu_entry_areas(void); +extern void cea_set_pte(void *cea_vaddr, phys_addr_t pa, pgprot_t flags); + +#define CPU_ENTRY_AREA_RO_IDT CPU_ENTRY_AREA_BASE +#define CPU_ENTRY_AREA_PER_CPU (CPU_ENTRY_AREA_RO_IDT + PAGE_SIZE) + +#define CPU_ENTRY_AREA_RO_IDT_VADDR ((void *)CPU_ENTRY_AREA_RO_IDT) + +#define CPU_ENTRY_AREA_MAP_SIZE \ + (CPU_ENTRY_AREA_PER_CPU + CPU_ENTRY_AREA_TOT_SIZE - CPU_ENTRY_AREA_BASE) + +extern struct cpu_entry_area *get_cpu_entry_area(int cpu); + +static inline struct entry_stack *cpu_entry_stack(int cpu) +{ + return &get_cpu_entry_area(cpu)->entry_stack_page.stack; +} + +#endif diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h index 2ace1f90d138..bc359dd2f7f6 100644 --- a/arch/x86/include/asm/desc.h +++ b/arch/x86/include/asm/desc.h @@ -7,6 +7,7 @@ #include <asm/mmu.h> #include <asm/fixmap.h> #include <asm/irq_vectors.h> +#include <asm/cpu_entry_area.h> #include <linux/smp.h> #include <linux/percpu.h> diff --git a/arch/x86/include/asm/espfix.h b/arch/x86/include/asm/espfix.h index 0211029076ea..6777480d8a42 100644 --- a/arch/x86/include/asm/espfix.h +++ b/arch/x86/include/asm/espfix.h @@ -2,7 +2,7 @@ #ifndef _ASM_X86_ESPFIX_H #define _ASM_X86_ESPFIX_H -#ifdef CONFIG_X86_64 +#ifdef CONFIG_X86_ESPFIX64 #include <asm/percpu.h> @@ -11,7 +11,8 @@ DECLARE_PER_CPU_READ_MOSTLY(unsigned long, espfix_waddr); extern void init_espfix_bsp(void); extern void init_espfix_ap(int cpu); - -#endif /* CONFIG_X86_64 */ +#else +static inline void init_espfix_ap(int cpu) { } +#endif #endif /* _ASM_X86_ESPFIX_H */ diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h index 94fc4fa14127..64c4a30e0d39 100644 --- a/arch/x86/include/asm/fixmap.h +++ b/arch/x86/include/asm/fixmap.h @@ -44,46 +44,6 @@ extern unsigned long __FIXADDR_TOP; PAGE_SIZE) #endif -/* - * cpu_entry_area is a percpu region in the fixmap that contains things - * needed by the CPU and early entry/exit code. Real types aren't used - * for all fields here to avoid circular header dependencies. - * - * Every field is a virtual alias of some other allocated backing store. - * There is no direct allocation of a struct cpu_entry_area. - */ -struct cpu_entry_area { - char gdt[PAGE_SIZE]; - - /* - * The GDT is just below SYSENTER_stack and thus serves (on x86_64) as - * a a read-only guard page. - */ - struct SYSENTER_stack_page SYSENTER_stack_page; - - /* - * On x86_64, the TSS is mapped RO. On x86_32, it's mapped RW because - * we need task switches to work, and task switches write to the TSS. - */ - struct tss_struct tss; - - char entry_trampoline[PAGE_SIZE]; - -#ifdef CONFIG_X86_64 - /* - * Exception stacks used for IST entries. - * - * In the future, this should have a separate slot for each stack - * with guard pages between them. - */ - char exception_stacks[(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ]; -#endif -}; - -#define CPU_ENTRY_AREA_PAGES (sizeof(struct cpu_entry_area) / PAGE_SIZE) - -extern void setup_cpu_entry_areas(void); - /* * Here we define all the compile-time 'special' virtual * addresses. The point is to have a constant address at @@ -123,7 +83,6 @@ enum fixed_addresses { FIX_IO_APIC_BASE_0, FIX_IO_APIC_BASE_END = FIX_IO_APIC_BASE_0 + MAX_IO_APICS - 1, #endif - FIX_RO_IDT, /* Virtual mapping for read-only IDT */ #ifdef CONFIG_X86_32 FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */ FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1, @@ -139,9 +98,6 @@ enum fixed_addresses { #ifdef CONFIG_X86_INTEL_MID FIX_LNW_VRTC, #endif - /* Fixmap entries to remap the GDTs, one per processor. */ - FIX_CPU_ENTRY_AREA_TOP, - FIX_CPU_ENTRY_AREA_BOTTOM = FIX_CPU_ENTRY_AREA_TOP + (CPU_ENTRY_AREA_PAGES * NR_CPUS) - 1, #ifdef CONFIG_ACPI_APEI_GHES /* Used for GHES mapping from assorted contexts */ @@ -182,7 +138,7 @@ enum fixed_addresses { extern void reserve_top_address(unsigned long reserve); #define FIXADDR_SIZE (__end_of_permanent_fixed_addresses << PAGE_SHIFT) -#define FIXADDR_START (FIXADDR_TOP - FIXADDR_SIZE) +#define FIXADDR_START (FIXADDR_TOP - FIXADDR_SIZE) extern int fixmaps_set; @@ -230,30 +186,5 @@ void __init *early_memremap_decrypted_wp(resource_size_t phys_addr, void __early_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t flags); -static inline unsigned int __get_cpu_entry_area_page_index(int cpu, int page) -{ - BUILD_BUG_ON(sizeof(struct cpu_entry_area) % PAGE_SIZE != 0); - - return FIX_CPU_ENTRY_AREA_BOTTOM - cpu*CPU_ENTRY_AREA_PAGES - page; -} - -#define __get_cpu_entry_area_offset_index(cpu, offset) ({ \ - BUILD_BUG_ON(offset % PAGE_SIZE != 0); \ - __get_cpu_entry_area_page_index(cpu, offset / PAGE_SIZE); \ - }) - -#define get_cpu_entry_area_index(cpu, field) \ - __get_cpu_entry_area_offset_index((cpu), offsetof(struct cpu_entry_area, field)) - -static inline struct cpu_entry_area *get_cpu_entry_area(int cpu) -{ - return (struct cpu_entry_area *)__fix_to_virt(__get_cpu_entry_area_page_index(cpu, 0)); -} - -static inline struct SYSENTER_stack *cpu_SYSENTER_stack(int cpu) -{ - return &get_cpu_entry_area(cpu)->SYSENTER_stack_page.stack; -} - #endif /* !__ASSEMBLY__ */ #endif /* _ASM_X86_FIXMAP_H */ diff --git a/arch/x86/include/asm/inat.h b/arch/x86/include/asm/inat.h index 02aff0867211..1c78580e58be 100644 --- a/arch/x86/include/asm/inat.h +++ b/arch/x86/include/asm/inat.h @@ -97,6 +97,16 @@ #define INAT_MAKE_GROUP(grp) ((grp << INAT_GRP_OFFS) | INAT_MODRM) #define INAT_MAKE_IMM(imm) (imm << INAT_IMM_OFFS) +/* Identifiers for segment registers */ +#define INAT_SEG_REG_IGNORE 0 +#define INAT_SEG_REG_DEFAULT 1 +#define INAT_SEG_REG_CS 2 +#define INAT_SEG_REG_SS 3 +#define INAT_SEG_REG_DS 4 +#define INAT_SEG_REG_ES 5 +#define INAT_SEG_REG_FS 6 +#define INAT_SEG_REG_GS 7 + /* Attribute search APIs */ extern insn_attr_t inat_get_opcode_attribute(insn_byte_t opcode); extern int inat_get_last_prefix_id(insn_byte_t last_pfx); diff --git a/arch/x86/include/asm/invpcid.h b/arch/x86/include/asm/invpcid.h new file mode 100644 index 000000000000..989cfa86de85 --- /dev/null +++ b/arch/x86/include/asm/invpcid.h @@ -0,0 +1,53 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_INVPCID +#define _ASM_X86_INVPCID + +static inline void __invpcid(unsigned long pcid, unsigned long addr, + unsigned long type) +{ + struct { u64 d[2]; } desc = { { pcid, addr } }; + + /* + * The memory clobber is because the whole point is to invalidate + * stale TLB entries and, especially if we're flushing global + * mappings, we don't want the compiler to reorder any subsequent + * memory accesses before the TLB flush. + * + * The hex opcode is invpcid (%ecx), %eax in 32-bit mode and + * invpcid (%rcx), %rax in long mode. + */ + asm volatile (".byte 0x66, 0x0f, 0x38, 0x82, 0x01" + : : "m" (desc), "a" (type), "c" (&desc) : "memory"); +} + +#define INVPCID_TYPE_INDIV_ADDR 0 +#define INVPCID_TYPE_SINGLE_CTXT 1 +#define INVPCID_TYPE_ALL_INCL_GLOBAL 2 +#define INVPCID_TYPE_ALL_NON_GLOBAL 3 + +/* Flush all mappings for a given pcid and addr, not including globals. */ +static inline void invpcid_flush_one(unsigned long pcid, + unsigned long addr) +{ + __invpcid(pcid, addr, INVPCID_TYPE_INDIV_ADDR); +} + +/* Flush all mappings for a given PCID, not including globals. */ +static inline void invpcid_flush_single_context(unsigned long pcid) +{ + __invpcid(pcid, 0, INVPCID_TYPE_SINGLE_CTXT); +} + +/* Flush all mappings, including globals, for all PCIDs. */ +static inline void invpcid_flush_all(void) +{ + __invpcid(0, 0, INVPCID_TYPE_ALL_INCL_GLOBAL); +} + +/* Flush all mappings for all PCIDs except globals. */ +static inline void invpcid_flush_all_nonglobals(void) +{ + __invpcid(0, 0, INVPCID_TYPE_ALL_NON_GLOBAL); +} + +#endif /* _ASM_X86_INVPCID */ diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h index 9ea26f167497..5ff3e8af2c20 100644 --- a/arch/x86/include/asm/mmu.h +++ b/arch/x86/include/asm/mmu.h @@ -3,6 +3,7 @@ #define _ASM_X86_MMU_H #include <linux/spinlock.h> +#include <linux/rwsem.h> #include <linux/mutex.h> #include <linux/atomic.h> @@ -27,7 +28,8 @@ typedef struct { atomic64_t tlb_gen; #ifdef CONFIG_MODIFY_LDT_SYSCALL - struct ldt_struct *ldt; + struct rw_semaphore ldt_usr_sem; + struct ldt_struct *ldt; #endif #ifdef CONFIG_X86_64 diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 6d16d15d09a0..5ede7cae1d67 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -57,11 +57,17 @@ struct ldt_struct { /* * Used for LDT copy/destruction. */ -int init_new_context_ldt(struct task_struct *tsk, struct mm_struct *mm); +static inline void init_new_context_ldt(struct mm_struct *mm) +{ + mm->context.ldt = NULL; + init_rwsem(&mm->context.ldt_usr_sem); +} +int ldt_dup_context(struct mm_struct *oldmm, struct mm_struct *mm); void destroy_context_ldt(struct mm_struct *mm); #else /* CONFIG_MODIFY_LDT_SYSCALL */ -static inline int init_new_context_ldt(struct task_struct *tsk, - struct mm_struct *mm) +static inline void init_new_context_ldt(struct mm_struct *mm) { } +static inline int ldt_dup_context(struct mm_struct *oldmm, + struct mm_struct *mm) { return 0; } @@ -132,18 +138,21 @@ void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk); static inline int init_new_context(struct task_struct *tsk, struct mm_struct *mm) { + mutex_init(&mm->context.lock); + mm->context.ctx_id = atomic64_inc_return(&last_mm_ctx_id); atomic64_set(&mm->context.tlb_gen, 0); - #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS +#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS if (cpu_feature_enabled(X86_FEATURE_OSPKE)) { /* pkey 0 is the default and always allocated */ mm->context.pkey_allocation_map = 0x1; /* -1 means unallocated or invalid */ mm->context.execute_only_pkey = -1; } - #endif - return init_new_context_ldt(tsk, mm); +#endif + init_new_context_ldt(mm); + return 0; } static inline void destroy_context(struct mm_struct *mm) { @@ -176,10 +185,10 @@ do { \ } while (0) #endif -static inline void arch_dup_mmap(struct mm_struct *oldmm, - struct mm_struct *mm) +static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm) { paravirt_arch_dup_mmap(oldmm, mm); + return ldt_dup_context(oldmm, mm); } static inline void arch_exit_mmap(struct mm_struct *mm) @@ -281,33 +290,6 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, return __pkru_allows_pkey(vma_pkey(vma), write); } -/* - * If PCID is on, ASID-aware code paths put the ASID+1 into the PCID - * bits. This serves two purposes. It prevents a nasty situation in - * which PCID-unaware code saves CR3, loads some other value (with PCID - * == 0), and then restores CR3, thus corrupting the TLB for ASID 0 if - * the saved ASID was nonzero. It also means that any bugs involving - * loading a PCID-enabled CR3 with CR4.PCIDE off will trigger - * deterministically. - */ - -static inline unsigned long build_cr3(struct mm_struct *mm, u16 asid) -{ - if (static_cpu_has(X86_FEATURE_PCID)) { - VM_WARN_ON_ONCE(asid > 4094); - return __sme_pa(mm->pgd) | (asid + 1); - } else { - VM_WARN_ON_ONCE(asid != 0); - return __sme_pa(mm->pgd); - } -} - -static inline unsigned long build_cr3_noflush(struct mm_struct *mm, u16 asid) -{ - VM_WARN_ON_ONCE(asid > 4094); - return __sme_pa(mm->pgd) | (asid + 1) | CR3_NOFLUSH; -} - /* * This can be used from process context to figure out what the value of * CR3 is without needing to do a (slow) __read_cr3(). @@ -317,7 +299,7 @@ static inline unsigned long build_cr3_noflush(struct mm_struct *mm, u16 asid) */ static inline unsigned long __get_current_cr3_fast(void) { - unsigned long cr3 = build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm), + unsigned long cr3 = build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd, this_cpu_read(cpu_tlbstate.loaded_mm_asid)); /* For now, be very restrictive about when this can be called. */ diff --git a/arch/x86/include/asm/pgtable_32_types.h b/arch/x86/include/asm/pgtable_32_types.h index f2ca9b28fd68..ce245b0cdfca 100644 --- a/arch/x86/include/asm/pgtable_32_types.h +++ b/arch/x86/include/asm/pgtable_32_types.h @@ -38,13 +38,22 @@ extern bool __vmalloc_start_set; /* set once high_memory is set */ #define LAST_PKMAP 1024 #endif -#define PKMAP_BASE ((FIXADDR_START - PAGE_SIZE * (LAST_PKMAP + 1)) \ - & PMD_MASK) +/* + * Define this here and validate with BUILD_BUG_ON() in pgtable_32.c + * to avoid include recursion hell + */ +#define CPU_ENTRY_AREA_PAGES (NR_CPUS * 40) + +#define CPU_ENTRY_AREA_BASE \ + ((FIXADDR_START - PAGE_SIZE * (CPU_ENTRY_AREA_PAGES + 1)) & PMD_MASK) + +#define PKMAP_BASE \ + ((CPU_ENTRY_AREA_BASE - PAGE_SIZE) & PMD_MASK) #ifdef CONFIG_HIGHMEM # define VMALLOC_END (PKMAP_BASE - 2 * PAGE_SIZE) #else -# define VMALLOC_END (FIXADDR_START - 2 * PAGE_SIZE) +# define VMALLOC_END (CPU_ENTRY_AREA_BASE - 2 * PAGE_SIZE) #endif #define MODULES_VADDR VMALLOC_START diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h index 6d5f45dcd4a1..3d27831bc58d 100644 --- a/arch/x86/include/asm/pgtable_64_types.h +++ b/arch/x86/include/asm/pgtable_64_types.h @@ -76,32 +76,41 @@ typedef struct { pteval_t pte; } pte_t; #define PGDIR_MASK (~(PGDIR_SIZE - 1)) /* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */ -#define MAXMEM _AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL) +#define MAXMEM _AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL) + #ifdef CONFIG_X86_5LEVEL -#define VMALLOC_SIZE_TB _AC(16384, UL) -#define __VMALLOC_BASE _AC(0xff92000000000000, UL) -#define __VMEMMAP_BASE _AC(0xffd4000000000000, UL) +# define VMALLOC_SIZE_TB _AC(16384, UL) +# define __VMALLOC_BASE _AC(0xff92000000000000, UL) +# define __VMEMMAP_BASE _AC(0xffd4000000000000, UL) #else -#define VMALLOC_SIZE_TB _AC(32, UL) -#define __VMALLOC_BASE _AC(0xffffc90000000000, UL) -#define __VMEMMAP_BASE _AC(0xffffea0000000000, UL) +# define VMALLOC_SIZE_TB _AC(32, UL) +# define __VMALLOC_BASE _AC(0xffffc90000000000, UL) +# define __VMEMMAP_BASE _AC(0xffffea0000000000, UL) #endif + #ifdef CONFIG_RANDOMIZE_MEMORY -#define VMALLOC_START vmalloc_base -#define VMEMMAP_START vmemmap_base +# define VMALLOC_START vmalloc_base +# define VMEMMAP_START vmemmap_base #else -#define VMALLOC_START __VMALLOC_BASE -#define VMEMMAP_START __VMEMMAP_BASE +# define VMALLOC_START __VMALLOC_BASE +# define VMEMMAP_START __VMEMMAP_BASE #endif /* CONFIG_RANDOMIZE_MEMORY */ -#define VMALLOC_END (VMALLOC_START + _AC((VMALLOC_SIZE_TB << 40) - 1, UL)) -#define MODULES_VADDR (__START_KERNEL_map + KERNEL_IMAGE_SIZE) + +#define VMALLOC_END (VMALLOC_START + _AC((VMALLOC_SIZE_TB << 40) - 1, UL)) + +#define MODULES_VADDR (__START_KERNEL_map + KERNEL_IMAGE_SIZE) /* The module sections ends with the start of the fixmap */ -#define MODULES_END __fix_to_virt(__end_of_fixed_addresses + 1) -#define MODULES_LEN (MODULES_END - MODULES_VADDR) -#define ESPFIX_PGD_ENTRY _AC(-2, UL) -#define ESPFIX_BASE_ADDR (ESPFIX_PGD_ENTRY << P4D_SHIFT) -#define EFI_VA_START ( -4 * (_AC(1, UL) << 30)) -#define EFI_VA_END (-68 * (_AC(1, UL) << 30)) +#define MODULES_END __fix_to_virt(__end_of_fixed_addresses + 1) +#define MODULES_LEN (MODULES_END - MODULES_VADDR) + +#define ESPFIX_PGD_ENTRY _AC(-2, UL) +#define ESPFIX_BASE_ADDR (ESPFIX_PGD_ENTRY << P4D_SHIFT) + +#define CPU_ENTRY_AREA_PGD _AC(-3, UL) +#define CPU_ENTRY_AREA_BASE (CPU_ENTRY_AREA_PGD << P4D_SHIFT) + +#define EFI_VA_START ( -4 * (_AC(1, UL) << 30)) +#define EFI_VA_END (-68 * (_AC(1, UL) << 30)) #define EARLY_DYNAMIC_PAGE_TABLES 64 diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index da943411d3d8..9e482d8b0b97 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -336,12 +336,12 @@ struct x86_hw_tss { #define IO_BITMAP_OFFSET (offsetof(struct tss_struct, io_bitmap) - offsetof(struct tss_struct, x86_tss)) #define INVALID_IO_BITMAP_OFFSET 0x8000 -struct SYSENTER_stack { +struct entry_stack { unsigned long words[64]; }; -struct SYSENTER_stack_page { - struct SYSENTER_stack stack; +struct entry_stack_page { + struct entry_stack stack; } __aligned(PAGE_SIZE); struct tss_struct { diff --git a/arch/x86/include/asm/stacktrace.h b/arch/x86/include/asm/stacktrace.h index f8062bfd43a0..f73706878772 100644 --- a/arch/x86/include/asm/stacktrace.h +++ b/arch/x86/include/asm/stacktrace.h @@ -16,7 +16,7 @@ enum stack_type { STACK_TYPE_TASK, STACK_TYPE_IRQ, STACK_TYPE_SOFTIRQ, - STACK_TYPE_SYSENTER, + STACK_TYPE_ENTRY, STACK_TYPE_EXCEPTION, STACK_TYPE_EXCEPTION_LAST = STACK_TYPE_EXCEPTION + N_EXCEPTION_STACKS-1, }; @@ -29,7 +29,7 @@ struct stack_info { bool in_task_stack(unsigned long *stack, struct task_struct *task, struct stack_info *info); -bool in_sysenter_stack(unsigned long *stack, struct stack_info *info); +bool in_entry_stack(unsigned long *stack, struct stack_info *info); int get_stack_info(unsigned long *stack, struct task_struct *task, struct stack_info *info, unsigned long *visit_mask); diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index 509046cfa5ce..171b429f43a2 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -9,70 +9,66 @@ #include <asm/cpufeature.h> #include <asm/special_insns.h> #include <asm/smp.h> +#include <asm/invpcid.h> -static inline void __invpcid(unsigned long pcid, unsigned long addr, - unsigned long type) +static inline u64 inc_mm_tlb_gen(struct mm_struct *mm) { - struct { u64 d[2]; } desc = { { pcid, addr } }; - /* - * The memory clobber is because the whole point is to invalidate - * stale TLB entries and, especially if we're flushing global - * mappings, we don't want the compiler to reorder any subsequent - * memory accesses before the TLB flush. - * - * The hex opcode is invpcid (%ecx), %eax in 32-bit mode and - * invpcid (%rcx), %rax in long mode. + * Bump the generation count. This also serves as a full barrier + * that synchronizes with switch_mm(): callers are required to order + * their read of mm_cpumask after their writes to the paging + * structures. */ - asm volatile (".byte 0x66, 0x0f, 0x38, 0x82, 0x01" - : : "m" (desc), "a" (type), "c" (&desc) : "memory"); + return atomic64_inc_return(&mm->context.tlb_gen); } -#define INVPCID_TYPE_INDIV_ADDR 0 -#define INVPCID_TYPE_SINGLE_CTXT 1 -#define INVPCID_TYPE_ALL_INCL_GLOBAL 2 -#define INVPCID_TYPE_ALL_NON_GLOBAL 3 - -/* Flush all mappings for a given pcid and addr, not including globals. */ -static inline void invpcid_flush_one(unsigned long pcid, - unsigned long addr) -{ - __invpcid(pcid, addr, INVPCID_TYPE_INDIV_ADDR); -} +/* There are 12 bits of space for ASIDS in CR3 */ +#define CR3_HW_ASID_BITS 12 +/* + * When enabled, PAGE_TABLE_ISOLATION consumes a single bit for + * user/kernel switches + */ +#define PTI_CONSUMED_ASID_BITS 0 -/* Flush all mappings for a given PCID, not including globals. */ -static inline void invpcid_flush_single_context(unsigned long pcid) -{ - __invpcid(pcid, 0, INVPCID_TYPE_SINGLE_CTXT); -} +#define CR3_AVAIL_ASID_BITS (CR3_HW_ASID_BITS - PTI_CONSUMED_ASID_BITS) +/* + * ASIDs are zero-based: 0->MAX_AVAIL_ASID are valid. -1 below to account + * for them being zero-based. Another -1 is because ASID 0 is reserved for + * use by non-PCID-aware users. + */ +#define MAX_ASID_AVAILABLE ((1 << CR3_AVAIL_ASID_BITS) - 2) -/* Flush all mappings, including globals, for all PCIDs. */ -static inline void invpcid_flush_all(void) +static inline u16 kern_pcid(u16 asid) { - __invpcid(0, 0, INVPCID_TYPE_ALL_INCL_GLOBAL); + VM_WARN_ON_ONCE(asid > MAX_ASID_AVAILABLE); + /* + * If PCID is on, ASID-aware code paths put the ASID+1 into the + * PCID bits. This serves two purposes. It prevents a nasty + * situation in which PCID-unaware code saves CR3, loads some other + * value (with PCID == 0), and then restores CR3, thus corrupting + * the TLB for ASID 0 if the saved ASID was nonzero. It also means + * that any bugs involving loading a PCID-enabled CR3 with + * CR4.PCIDE off will trigger deterministically. + */ + return asid + 1; } -/* Flush all mappings for all PCIDs except globals. */ -static inline void invpcid_flush_all_nonglobals(void) +struct pgd_t; +static inline unsigned long build_cr3(pgd_t *pgd, u16 asid) { - __invpcid(0, 0, INVPCID_TYPE_ALL_NON_GLOBAL); + if (static_cpu_has(X86_FEATURE_PCID)) { + return __sme_pa(pgd) | kern_pcid(asid); + } else { + VM_WARN_ON_ONCE(asid != 0); + return __sme_pa(pgd); + } } -static inline u64 inc_mm_tlb_gen(struct mm_struct *mm) +static inline unsigned long build_cr3_noflush(pgd_t *pgd, u16 asid) { - u64 new_tlb_gen; - - /* - * Bump the generation count. This also serves as a full barrier - * that synchronizes with switch_mm(): callers are required to order - * their read of mm_cpumask after their writes to the paging - * structures. - */ - smp_mb__before_atomic(); - new_tlb_gen = atomic64_inc_return(&mm->context.tlb_gen); - smp_mb__after_atomic(); - - return new_tlb_gen; + VM_WARN_ON_ONCE(asid > MAX_ASID_AVAILABLE); + VM_WARN_ON_ONCE(!this_cpu_has(X86_FEATURE_PCID)); + return __sme_pa(pgd) | kern_pcid(asid) | CR3_NOFLUSH; } #ifdef CONFIG_PARAVIRT @@ -234,6 +230,9 @@ static inline void cr4_set_bits_and_update_boot(unsigned long mask) extern void initialize_tlbstate_and_flush(void); +/* + * flush the entire current user mapping + */ static inline void __native_flush_tlb(void) { /* @@ -246,20 +245,12 @@ static inline void __native_flush_tlb(void) preempt_enable(); } -static inline void __native_flush_tlb_global_irq_disabled(void) -{ - unsigned long cr4; - - cr4 = this_cpu_read(cpu_tlbstate.cr4); - /* clear PGE */ - native_write_cr4(cr4 & ~X86_CR4_PGE); - /* write old PGE again and flush TLBs */ - native_write_cr4(cr4); -} - +/* + * flush everything + */ static inline void __native_flush_tlb_global(void) { - unsigned long flags; + unsigned long cr4, flags; if (static_cpu_has(X86_FEATURE_INVPCID)) { /* @@ -277,22 +268,36 @@ static inline void __native_flush_tlb_global(void) */ raw_local_irq_save(flags); - __native_flush_tlb_global_irq_disabled(); + cr4 = this_cpu_read(cpu_tlbstate.cr4); + /* toggle PGE */ + native_write_cr4(cr4 ^ X86_CR4_PGE); + /* write old PGE again and flush TLBs */ + native_write_cr4(cr4); raw_local_irq_restore(flags); } +/* + * flush one page in the user mapping + */ static inline void __native_flush_tlb_single(unsigned long addr) { asm volatile("invlpg (%0)" ::"r" (addr) : "memory"); } +/* + * flush everything + */ static inline void __flush_tlb_all(void) { - if (boot_cpu_has(X86_FEATURE_PGE)) + if (boot_cpu_has(X86_FEATURE_PGE)) { __flush_tlb_global(); - else + } else { + /* + * !PGE -> !PCID (setup_pcid()), thus every flush is total. + */ __flush_tlb(); + } /* * Note: if we somehow had PCID but not PGE, then this wouldn't work -- @@ -303,6 +308,9 @@ static inline void __flush_tlb_all(void) */ } +/* + * flush one page in the kernel mapping + */ static inline void __flush_tlb_one(unsigned long addr) { count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ONE); diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c index cd360a5e0dca..676b7cf4b62b 100644 --- a/arch/x86/kernel/asm-offsets.c +++ b/arch/x86/kernel/asm-offsets.c @@ -97,6 +97,6 @@ void common(void) { /* Layout info for cpu_entry_area */ OFFSET(CPU_ENTRY_AREA_tss, cpu_entry_area, tss); OFFSET(CPU_ENTRY_AREA_entry_trampoline, cpu_entry_area, entry_trampoline); - OFFSET(CPU_ENTRY_AREA_SYSENTER_stack, cpu_entry_area, SYSENTER_stack_page); - DEFINE(SIZEOF_SYSENTER_stack, sizeof(struct SYSENTER_stack)); + OFFSET(CPU_ENTRY_AREA_entry_stack, cpu_entry_area, entry_stack_page); + DEFINE(SIZEOF_entry_stack, sizeof(struct entry_stack)); } diff --git a/arch/x86/kernel/asm-offsets_32.c b/arch/x86/kernel/asm-offsets_32.c index 7d20d9c0b3d6..fa1261eefa16 100644 --- a/arch/x86/kernel/asm-offsets_32.c +++ b/arch/x86/kernel/asm-offsets_32.c @@ -48,7 +48,7 @@ void foo(void) /* Offset from the sysenter stack to tss.sp0 */ DEFINE(TSS_sysenter_sp0, offsetof(struct cpu_entry_area, tss.x86_tss.sp0) - - offsetofend(struct cpu_entry_area, SYSENTER_stack_page.stack)); + offsetofend(struct cpu_entry_area, entry_stack_page.stack)); #ifdef CONFIG_CC_STACKPROTECTOR BLANK(); diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 034900623adf..8ddcfa4d4165 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -482,102 +482,8 @@ static const unsigned int exception_stack_sizes[N_EXCEPTION_STACKS] = { [0 ... N_EXCEPTION_STACKS - 1] = EXCEPTION_STKSZ, [DEBUG_STACK - 1] = DEBUG_STKSZ }; - -static DEFINE_PER_CPU_PAGE_ALIGNED(char, exception_stacks - [(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ]); -#endif - -static DEFINE_PER_CPU_PAGE_ALIGNED(struct SYSENTER_stack_page, - SYSENTER_stack_storage); - -static void __init -set_percpu_fixmap_pages(int idx, void *ptr, int pages, pgprot_t prot) -{ - for ( ; pages; pages--, idx--, ptr += PAGE_SIZE) - __set_fixmap(idx, per_cpu_ptr_to_phys(ptr), prot); -} - -/* Setup the fixmap mappings only once per-processor */ -static void __init setup_cpu_entry_area(int cpu) -{ -#ifdef CONFIG_X86_64 - extern char _entry_trampoline[]; - - /* On 64-bit systems, we use a read-only fixmap GDT and TSS. */ - pgprot_t gdt_prot = PAGE_KERNEL_RO; - pgprot_t tss_prot = PAGE_KERNEL_RO; -#else - /* - * On native 32-bit systems, the GDT cannot be read-only because - * our double fault handler uses a task gate, and entering through - * a task gate needs to change an available TSS to busy. If the - * GDT is read-only, that will triple fault. The TSS cannot be - * read-only because the CPU writes to it on task switches. - * - * On Xen PV, the GDT must be read-only because the hypervisor - * requires it. - */ - pgprot_t gdt_prot = boot_cpu_has(X86_FEATURE_XENPV) ? - PAGE_KERNEL_RO : PAGE_KERNEL; - pgprot_t tss_prot = PAGE_KERNEL; -#endif - - __set_fixmap(get_cpu_entry_area_index(cpu, gdt), get_cpu_gdt_paddr(cpu), gdt_prot); - set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, SYSENTER_stack_page), - per_cpu_ptr(&SYSENTER_stack_storage, cpu), 1, - PAGE_KERNEL); - - /* - * The Intel SDM says (Volume 3, 7.2.1): - * - * Avoid placing a page boundary in the part of the TSS that the - * processor reads during a task switch (the first 104 bytes). The - * processor may not correctly perform address translations if a - * boundary occurs in this area. During a task switch, the processor - * reads and writes into the first 104 bytes of each TSS (using - * contiguous physical addresses beginning with the physical address - * of the first byte of the TSS). So, after TSS access begins, if - * part of the 104 bytes is not physically contiguous, the processor - * will access incorrect information without generating a page-fault - * exception. - * - * There are also a lot of errata involving the TSS spanning a page - * boundary. Assert that we're not doing that. - */ - BUILD_BUG_ON((offsetof(struct tss_struct, x86_tss) ^ - offsetofend(struct tss_struct, x86_tss)) & PAGE_MASK); - BUILD_BUG_ON(sizeof(struct tss_struct) % PAGE_SIZE != 0); - set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, tss), - &per_cpu(cpu_tss_rw, cpu), - sizeof(struct tss_struct) / PAGE_SIZE, - tss_prot); - -#ifdef CONFIG_X86_32 - per_cpu(cpu_entry_area, cpu) = get_cpu_entry_area(cpu); #endif -#ifdef CONFIG_X86_64 - BUILD_BUG_ON(sizeof(exception_stacks) % PAGE_SIZE != 0); - BUILD_BUG_ON(sizeof(exception_stacks) != - sizeof(((struct cpu_entry_area *)0)->exception_stacks)); - set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, exception_stacks), - &per_cpu(exception_stacks, cpu), - sizeof(exception_stacks) / PAGE_SIZE, - PAGE_KERNEL); - - __set_fixmap(get_cpu_entry_area_index(cpu, entry_trampoline), - __pa_symbol(_entry_trampoline), PAGE_KERNEL_RX); -#endif -} - -void __init setup_cpu_entry_areas(void) -{ - unsigned int cpu; - - for_each_possible_cpu(cpu) - setup_cpu_entry_area(cpu); -} - /* Load the original GDT from the per-cpu structure */ void load_direct_gdt(int cpu) { @@ -1323,7 +1229,7 @@ void enable_sep_cpu(void) tss->x86_tss.ss1 = __KERNEL_CS; wrmsr(MSR_IA32_SYSENTER_CS, tss->x86_tss.ss1, 0); - wrmsr(MSR_IA32_SYSENTER_ESP, (unsigned long)(cpu_SYSENTER_stack(cpu) + 1), 0); + wrmsr(MSR_IA32_SYSENTER_ESP, (unsigned long)(cpu_entry_stack(cpu) + 1), 0); wrmsr(MSR_IA32_SYSENTER_EIP, (unsigned long)entry_SYSENTER_32, 0); put_cpu(); @@ -1440,7 +1346,7 @@ void syscall_init(void) * AMD doesn't allow SYSENTER in long mode (either 32- or 64-bit). */ wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)__KERNEL_CS); - wrmsrl_safe(MSR_IA32_SYSENTER_ESP, (unsigned long)(cpu_SYSENTER_stack(cpu) + 1)); + wrmsrl_safe(MSR_IA32_SYSENTER_ESP, (unsigned long)(cpu_entry_stack(cpu) + 1)); wrmsrl_safe(MSR_IA32_SYSENTER_EIP, (u64)entry_SYSENTER_compat); #else wrmsrl(MSR_CSTAR, (unsigned long)ignore_sysret); @@ -1655,7 +1561,7 @@ void cpu_init(void) */ set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss); load_TR_desc(); - load_sp0((unsigned long)(cpu_SYSENTER_stack(cpu) + 1)); + load_sp0((unsigned long)(cpu_entry_stack(cpu) + 1)); load_mm_ldt(&init_mm); diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c index 7dbcb7adf797..8ccdca6d3f9e 100644 --- a/arch/x86/kernel/cpu/microcode/intel.c +++ b/arch/x86/kernel/cpu/microcode/intel.c @@ -565,15 +565,6 @@ static void print_ucode(struct ucode_cpu_info *uci) } #else -/* - * Flush global tlb. We only do this in x86_64 where paging has been enabled - * already and PGE should be enabled as well. - */ -static inline void flush_tlb_early(void) -{ - __native_flush_tlb_global_irq_disabled(); -} - static inline void print_ucode(struct ucode_cpu_info *uci) { struct microcode_intel *mc; @@ -602,10 +593,6 @@ static int apply_microcode_early(struct ucode_cpu_info *uci, bool early) if (rev != mc->hdr.rev) return -1; -#ifdef CONFIG_X86_64 - /* Flush global tlb. This is precaution. */ - flush_tlb_early(); -#endif uci->cpu_sig.rev = rev; if (early) diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c index bbd6d986e2d0..36b17e0febe8 100644 --- a/arch/x86/kernel/dumpstack.c +++ b/arch/x86/kernel/dumpstack.c @@ -18,6 +18,7 @@ #include <linux/nmi.h> #include <linux/sysfs.h> +#include <asm/cpu_entry_area.h> #include <asm/stacktrace.h> #include <asm/unwind.h> @@ -43,9 +44,9 @@ bool in_task_stack(unsigned long *stack, struct task_struct *task, return true; } -bool in_sysenter_stack(unsigned long *stack, struct stack_info *info) +bool in_entry_stack(unsigned long *stack, struct stack_info *info) { - struct SYSENTER_stack *ss = cpu_SYSENTER_stack(smp_processor_id()); + struct entry_stack *ss = cpu_entry_stack(smp_processor_id()); void *begin = ss; void *end = ss + 1; @@ -53,7 +54,7 @@ bool in_sysenter_stack(unsigned long *stack, struct stack_info *info) if ((void *)stack < begin || (void *)stack >= end) return false; - info->type = STACK_TYPE_SYSENTER; + info->type = STACK_TYPE_ENTRY; info->begin = begin; info->end = end; info->next_sp = NULL; @@ -111,13 +112,13 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, * - task stack * - interrupt stack * - HW exception stacks (double fault, nmi, debug, mce) - * - SYSENTER stack + * - entry stack * * x86-32 can have up to four stacks: * - task stack * - softirq stack * - hardirq stack - * - SYSENTER stack + * - entry stack */ for (regs = NULL; stack; stack = PTR_ALIGN(stack_info.next_sp, sizeof(long))) { const char *stack_name; diff --git a/arch/x86/kernel/dumpstack_32.c b/arch/x86/kernel/dumpstack_32.c index 5ff13a6b3680..04170f63e3a1 100644 --- a/arch/x86/kernel/dumpstack_32.c +++ b/arch/x86/kernel/dumpstack_32.c @@ -26,8 +26,8 @@ const char *stack_type_name(enum stack_type type) if (type == STACK_TYPE_SOFTIRQ) return "SOFTIRQ"; - if (type == STACK_TYPE_SYSENTER) - return "SYSENTER"; + if (type == STACK_TYPE_ENTRY) + return "ENTRY_TRAMPOLINE"; return NULL; } @@ -96,7 +96,7 @@ int get_stack_info(unsigned long *stack, struct task_struct *task, if (task != current) goto unknown; - if (in_sysenter_stack(stack, info)) + if (in_entry_stack(stack, info)) goto recursion_check; if (in_hardirq_stack(stack, info)) diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c index abc828f8c297..563e28d14f2c 100644 --- a/arch/x86/kernel/dumpstack_64.c +++ b/arch/x86/kernel/dumpstack_64.c @@ -37,8 +37,14 @@ const char *stack_type_name(enum stack_type type) if (type == STACK_TYPE_IRQ) return "IRQ"; - if (type == STACK_TYPE_SYSENTER) - return "SYSENTER"; + if (type == STACK_TYPE_ENTRY) { + /* + * On 64-bit, we have a generic entry stack that we + * use for all the kernel entry points, including + * SYSENTER. + */ + return "ENTRY_TRAMPOLINE"; + } if (type >= STACK_TYPE_EXCEPTION && type <= STACK_TYPE_EXCEPTION_LAST) return exception_stack_names[type - STACK_TYPE_EXCEPTION]; @@ -118,7 +124,7 @@ int get_stack_info(unsigned long *stack, struct task_struct *task, if (in_irq_stack(stack, info)) goto recursion_check; - if (in_sysenter_stack(stack, info)) + if (in_entry_stack(stack, info)) goto recursion_check; goto unknown; diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c index 1c1eae961340..a6b5d62f45a7 100644 --- a/arch/x86/kernel/ldt.c +++ b/arch/x86/kernel/ldt.c @@ -5,6 +5,11 @@ * Copyright (C) 2002 Andi Kleen * * This handles calls from both 32bit and 64bit mode. + * + * Lock order: + * contex.ldt_usr_sem + * mmap_sem + * context.lock */ #include <linux/errno.h> @@ -42,7 +47,7 @@ static void refresh_ldt_segments(void) #endif } -/* context.lock is held for us, so we don't need any locking. */ +/* context.lock is held by the task which issued the smp function call */ static void flush_ldt(void *__mm) { struct mm_struct *mm = __mm; @@ -99,15 +104,17 @@ static void finalize_ldt_struct(struct ldt_struct *ldt) paravirt_alloc_ldt(ldt->entries, ldt->nr_entries); } -/* context.lock is held */ -static void install_ldt(struct mm_struct *current_mm, - struct ldt_struct *ldt) +static void install_ldt(struct mm_struct *mm, struct ldt_struct *ldt) { + mutex_lock(&mm->context.lock); + /* Synchronizes with READ_ONCE in load_mm_ldt. */ - smp_store_release(&current_mm->context.ldt, ldt); + smp_store_release(&mm->context.ldt, ldt); - /* Activate the LDT for all CPUs using current_mm. */ - on_each_cpu_mask(mm_cpumask(current_mm), flush_ldt, current_mm, true); + /* Activate the LDT for all CPUs using currents mm. */ + on_each_cpu_mask(mm_cpumask(mm), flush_ldt, mm, true); + + mutex_unlock(&mm->context.lock); } static void free_ldt_struct(struct ldt_struct *ldt) @@ -124,27 +131,20 @@ static void free_ldt_struct(struct ldt_struct *ldt) } /* - * we do not have to muck with descriptors here, that is - * done in switch_mm() as needed. + * Called on fork from arch_dup_mmap(). Just copy the current LDT state, + * the new task is not running, so nothing can be installed. */ -int init_new_context_ldt(struct task_struct *tsk, struct mm_struct *mm) +int ldt_dup_context(struct mm_struct *old_mm, struct mm_struct *mm) { struct ldt_struct *new_ldt; - struct mm_struct *old_mm; int retval = 0; - mutex_init(&mm->context.lock); - old_mm = current->mm; - if (!old_mm) { - mm->context.ldt = NULL; + if (!old_mm) return 0; - } mutex_lock(&old_mm->context.lock); - if (!old_mm->context.ldt) { - mm->context.ldt = NULL; + if (!old_mm->context.ldt) goto out_unlock; - } new_ldt = alloc_ldt_struct(old_mm->context.ldt->nr_entries); if (!new_ldt) { @@ -180,7 +180,7 @@ static int read_ldt(void __user *ptr, unsigned long bytecount) unsigned long entries_size; int retval; - mutex_lock(&mm->context.lock); + down_read(&mm->context.ldt_usr_sem); if (!mm->context.ldt) { retval = 0; @@ -209,7 +209,7 @@ static int read_ldt(void __user *ptr, unsigned long bytecount) retval = bytecount; out_unlock: - mutex_unlock(&mm->context.lock); + up_read(&mm->context.ldt_usr_sem); return retval; } @@ -269,7 +269,8 @@ static int write_ldt(void __user *ptr, unsigned long bytecount, int oldmode) ldt.avl = 0; } - mutex_lock(&mm->context.lock); + if (down_write_killable(&mm->context.ldt_usr_sem)) + return -EINTR; old_ldt = mm->context.ldt; old_nr_entries = old_ldt ? old_ldt->nr_entries : 0; @@ -291,7 +292,7 @@ static int write_ldt(void __user *ptr, unsigned long bytecount, int oldmode) error = 0; out_unlock: - mutex_unlock(&mm->context.lock); + up_write(&mm->context.ldt_usr_sem); out: return error; } diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 142126ab5aae..12bf07d44dfe 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -990,12 +990,8 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle, initial_code = (unsigned long)start_secondary; initial_stack = idle->thread.sp; - /* - * Enable the espfix hack for this CPU - */ -#ifdef CONFIG_X86_ESPFIX64 + /* Enable the espfix hack for this CPU */ init_espfix_ap(cpu); -#endif /* So we see what's up */ announce_cpu(cpu, apicid); diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 74136fd16f49..7c16fe0b60c2 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -52,6 +52,7 @@ #include <asm/traps.h> #include <asm/desc.h> #include <asm/fpu/internal.h> +#include <asm/cpu_entry_area.h> #include <asm/mce.h> #include <asm/fixmap.h> #include <asm/mach_traps.h> @@ -950,8 +951,9 @@ void __init trap_init(void) * "sidt" instruction will not leak the location of the kernel, and * to defend the IDT against arbitrary memory write vulnerabilities. * It will be reloaded in cpu_init() */ - __set_fixmap(FIX_RO_IDT, __pa_symbol(idt_table), PAGE_KERNEL_RO); - idt_descr.address = fix_to_virt(FIX_RO_IDT); + cea_set_pte(CPU_ENTRY_AREA_RO_IDT_VADDR, __pa_symbol(idt_table), + PAGE_KERNEL_RO); + idt_descr.address = CPU_ENTRY_AREA_RO_IDT; /* * Should be a barrier for any external CPU state: diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index d90cdc77e077..7bbb5da2b49d 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -2404,9 +2404,21 @@ static int rsm_load_seg_64(struct x86_emulate_ctxt *ctxt, u64 smbase, int n) } static int rsm_enter_protected_mode(struct x86_emulate_ctxt *ctxt, - u64 cr0, u64 cr4) + u64 cr0, u64 cr3, u64 cr4) { int bad; + u64 pcid; + + /* In order to later set CR4.PCIDE, CR3[11:0] must be zero. */ + pcid = 0; + if (cr4 & X86_CR4_PCIDE) { + pcid = cr3 & 0xfff; + cr3 &= ~0xfff; + } + + bad = ctxt->ops->set_cr(ctxt, 3, cr3); + if (bad) + return X86EMUL_UNHANDLEABLE; /* * First enable PAE, long mode needs it before CR0.PG = 1 is set. @@ -2425,6 +2437,12 @@ static int rsm_enter_protected_mode(struct x86_emulate_ctxt *ctxt, bad = ctxt->ops->set_cr(ctxt, 4, cr4); if (bad) return X86EMUL_UNHANDLEABLE; + if (pcid) { + bad = ctxt->ops->set_cr(ctxt, 3, cr3 | pcid); + if (bad) + return X86EMUL_UNHANDLEABLE; + } + } return X86EMUL_CONTINUE; @@ -2435,11 +2453,11 @@ static int rsm_load_state_32(struct x86_emulate_ctxt *ctxt, u64 smbase) struct desc_struct desc; struct desc_ptr dt; u16 selector; - u32 val, cr0, cr4; + u32 val, cr0, cr3, cr4; int i; cr0 = GET_SMSTATE(u32, smbase, 0x7ffc); - ctxt->ops->set_cr(ctxt, 3, GET_SMSTATE(u32, smbase, 0x7ff8)); + cr3 = GET_SMSTATE(u32, smbase, 0x7ff8); ctxt->eflags = GET_SMSTATE(u32, smbase, 0x7ff4) | X86_EFLAGS_FIXED; ctxt->_eip = GET_SMSTATE(u32, smbase, 0x7ff0); @@ -2481,14 +2499,14 @@ static int rsm_load_state_32(struct x86_emulate_ctxt *ctxt, u64 smbase) ctxt->ops->set_smbase(ctxt, GET_SMSTATE(u32, smbase, 0x7ef8)); - return rsm_enter_protected_mode(ctxt, cr0, cr4); + return rsm_enter_protected_mode(ctxt, cr0, cr3, cr4); } static int rsm_load_state_64(struct x86_emulate_ctxt *ctxt, u64 smbase) { struct desc_struct desc; struct desc_ptr dt; - u64 val, cr0, cr4; + u64 val, cr0, cr3, cr4; u32 base3; u16 selector; int i, r; @@ -2505,7 +2523,7 @@ static int rsm_load_state_64(struct x86_emulate_ctxt *ctxt, u64 smbase) ctxt->ops->set_dr(ctxt, 7, (val & DR7_VOLATILE) | DR7_FIXED_1); cr0 = GET_SMSTATE(u64, smbase, 0x7f58); - ctxt->ops->set_cr(ctxt, 3, GET_SMSTATE(u64, smbase, 0x7f50)); + cr3 = GET_SMSTATE(u64, smbase, 0x7f50); cr4 = GET_SMSTATE(u64, smbase, 0x7f48); ctxt->ops->set_smbase(ctxt, GET_SMSTATE(u32, smbase, 0x7f00)); val = GET_SMSTATE(u64, smbase, 0x7ed0); @@ -2533,7 +2551,7 @@ static int rsm_load_state_64(struct x86_emulate_ctxt *ctxt, u64 smbase) dt.address = GET_SMSTATE(u64, smbase, 0x7e68); ctxt->ops->set_gdt(ctxt, &dt); - r = rsm_enter_protected_mode(ctxt, cr0, cr4); + r = rsm_enter_protected_mode(ctxt, cr0, cr3, cr4); if (r != X86EMUL_CONTINUE) return r; diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 13ebeedcec07..0fce8d73403c 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3382,7 +3382,7 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu) spin_lock(&vcpu->kvm->mmu_lock); if(make_mmu_pages_available(vcpu) < 0) { spin_unlock(&vcpu->kvm->mmu_lock); - return 1; + return -ENOSPC; } sp = kvm_mmu_get_page(vcpu, 0, 0, vcpu->arch.mmu.shadow_root_level, 1, ACC_ALL); @@ -3397,7 +3397,7 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu) spin_lock(&vcpu->kvm->mmu_lock); if (make_mmu_pages_available(vcpu) < 0) { spin_unlock(&vcpu->kvm->mmu_lock); - return 1; + return -ENOSPC; } sp = kvm_mmu_get_page(vcpu, i << (30 - PAGE_SHIFT), i << 30, PT32_ROOT_LEVEL, 1, ACC_ALL); @@ -3437,7 +3437,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu) spin_lock(&vcpu->kvm->mmu_lock); if (make_mmu_pages_available(vcpu) < 0) { spin_unlock(&vcpu->kvm->mmu_lock); - return 1; + return -ENOSPC; } sp = kvm_mmu_get_page(vcpu, root_gfn, 0, vcpu->arch.mmu.shadow_root_level, 0, ACC_ALL); @@ -3474,7 +3474,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu) spin_lock(&vcpu->kvm->mmu_lock); if (make_mmu_pages_available(vcpu) < 0) { spin_unlock(&vcpu->kvm->mmu_lock); - return 1; + return -ENOSPC; } sp = kvm_mmu_get_page(vcpu, root_gfn, i << 30, PT32_ROOT_LEVEL, 0, ACC_ALL); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index df62cdc7a258..075619a92ce7 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7359,7 +7359,7 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) #endif kvm_rip_write(vcpu, regs->rip); - kvm_set_rflags(vcpu, regs->rflags); + kvm_set_rflags(vcpu, regs->rflags | X86_EFLAGS_FIXED); vcpu->arch.exception.pending = false; diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt index c4d55919fac1..e0b85930dd77 100644 --- a/arch/x86/lib/x86-opcode-map.txt +++ b/arch/x86/lib/x86-opcode-map.txt @@ -607,7 +607,7 @@ fb: psubq Pq,Qq | vpsubq Vx,Hx,Wx (66),(v1) fc: paddb Pq,Qq | vpaddb Vx,Hx,Wx (66),(v1) fd: paddw Pq,Qq | vpaddw Vx,Hx,Wx (66),(v1) fe: paddd Pq,Qq | vpaddd Vx,Hx,Wx (66),(v1) -ff: +ff: UD0 EndTable Table: 3-byte opcode 1 (0x0f 0x38) @@ -717,7 +717,7 @@ AVXcode: 2 7e: vpermt2d/q Vx,Hx,Wx (66),(ev) 7f: vpermt2ps/d Vx,Hx,Wx (66),(ev) 80: INVEPT Gy,Mdq (66) -81: INVPID Gy,Mdq (66) +81: INVVPID Gy,Mdq (66) 82: INVPCID Gy,Mdq (66) 83: vpmultishiftqb Vx,Hx,Wx (66),(ev) 88: vexpandps/d Vpd,Wpd (66),(ev) @@ -970,6 +970,15 @@ GrpTable: Grp9 EndTable GrpTable: Grp10 +# all are UD1 +0: UD1 +1: UD1 +2: UD1 +3: UD1 +4: UD1 +5: UD1 +6: UD1 +7: UD1 EndTable # Grp11A and Grp11B are expressed as Grp11 in Intel SDM diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 7ba7f3d7f477..2e0017af8f9b 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -10,7 +10,7 @@ CFLAGS_REMOVE_mem_encrypt.o = -pg endif obj-y := init.o init_$(BITS).o fault.o ioremap.o extable.o pageattr.o mmap.o \ - pat.o pgtable.o physaddr.o setup_nx.o tlb.o + pat.o pgtable.o physaddr.o setup_nx.o tlb.o cpu_entry_area.o # Make sure __phys_addr has no stackprotector nostackp := $(call cc-option, -fno-stack-protector) diff --git a/arch/x86/mm/cpu_entry_area.c b/arch/x86/mm/cpu_entry_area.c new file mode 100644 index 000000000000..fe814fd5e014 --- /dev/null +++ b/arch/x86/mm/cpu_entry_area.c @@ -0,0 +1,139 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <linux/spinlock.h> +#include <linux/percpu.h> + +#include <asm/cpu_entry_area.h> +#include <asm/pgtable.h> +#include <asm/fixmap.h> +#include <asm/desc.h> + +static DEFINE_PER_CPU_PAGE_ALIGNED(struct entry_stack_page, entry_stack_storage); + +#ifdef CONFIG_X86_64 +static DEFINE_PER_CPU_PAGE_ALIGNED(char, exception_stacks + [(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ]); +#endif + +struct cpu_entry_area *get_cpu_entry_area(int cpu) +{ + unsigned long va = CPU_ENTRY_AREA_PER_CPU + cpu * CPU_ENTRY_AREA_SIZE; + BUILD_BUG_ON(sizeof(struct cpu_entry_area) % PAGE_SIZE != 0); + + return (struct cpu_entry_area *) va; +} +EXPORT_SYMBOL(get_cpu_entry_area); + +void cea_set_pte(void *cea_vaddr, phys_addr_t pa, pgprot_t flags) +{ + unsigned long va = (unsigned long) cea_vaddr; + + set_pte_vaddr(va, pfn_pte(pa >> PAGE_SHIFT, flags)); +} + +static void __init +cea_map_percpu_pages(void *cea_vaddr, void *ptr, int pages, pgprot_t prot) +{ + for ( ; pages; pages--, cea_vaddr+= PAGE_SIZE, ptr += PAGE_SIZE) + cea_set_pte(cea_vaddr, per_cpu_ptr_to_phys(ptr), prot); +} + +/* Setup the fixmap mappings only once per-processor */ +static void __init setup_cpu_entry_area(int cpu) +{ +#ifdef CONFIG_X86_64 + extern char _entry_trampoline[]; + + /* On 64-bit systems, we use a read-only fixmap GDT and TSS. */ + pgprot_t gdt_prot = PAGE_KERNEL_RO; + pgprot_t tss_prot = PAGE_KERNEL_RO; +#else + /* + * On native 32-bit systems, the GDT cannot be read-only because + * our double fault handler uses a task gate, and entering through + * a task gate needs to change an available TSS to busy. If the + * GDT is read-only, that will triple fault. The TSS cannot be + * read-only because the CPU writes to it on task switches. + * + * On Xen PV, the GDT must be read-only because the hypervisor + * requires it. + */ + pgprot_t gdt_prot = boot_cpu_has(X86_FEATURE_XENPV) ? + PAGE_KERNEL_RO : PAGE_KERNEL; + pgprot_t tss_prot = PAGE_KERNEL; +#endif + + cea_set_pte(&get_cpu_entry_area(cpu)->gdt, get_cpu_gdt_paddr(cpu), + gdt_prot); + + cea_map_percpu_pages(&get_cpu_entry_area(cpu)->entry_stack_page, + per_cpu_ptr(&entry_stack_storage, cpu), 1, + PAGE_KERNEL); + + /* + * The Intel SDM says (Volume 3, 7.2.1): + * + * Avoid placing a page boundary in the part of the TSS that the + * processor reads during a task switch (the first 104 bytes). The + * processor may not correctly perform address translations if a + * boundary occurs in this area. During a task switch, the processor + * reads and writes into the first 104 bytes of each TSS (using + * contiguous physical addresses beginning with the physical address + * of the first byte of the TSS). So, after TSS access begins, if + * part of the 104 bytes is not physically contiguous, the processor + * will access incorrect information without generating a page-fault + * exception. + * + * There are also a lot of errata involving the TSS spanning a page + * boundary. Assert that we're not doing that. + */ + BUILD_BUG_ON((offsetof(struct tss_struct, x86_tss) ^ + offsetofend(struct tss_struct, x86_tss)) & PAGE_MASK); + BUILD_BUG_ON(sizeof(struct tss_struct) % PAGE_SIZE != 0); + cea_map_percpu_pages(&get_cpu_entry_area(cpu)->tss, + &per_cpu(cpu_tss_rw, cpu), + sizeof(struct tss_struct) / PAGE_SIZE, tss_prot); + +#ifdef CONFIG_X86_32 + per_cpu(cpu_entry_area, cpu) = get_cpu_entry_area(cpu); +#endif + +#ifdef CONFIG_X86_64 + BUILD_BUG_ON(sizeof(exception_stacks) % PAGE_SIZE != 0); + BUILD_BUG_ON(sizeof(exception_stacks) != + sizeof(((struct cpu_entry_area *)0)->exception_stacks)); + cea_map_percpu_pages(&get_cpu_entry_area(cpu)->exception_stacks, + &per_cpu(exception_stacks, cpu), + sizeof(exception_stacks) / PAGE_SIZE, PAGE_KERNEL); + + cea_set_pte(&get_cpu_entry_area(cpu)->entry_trampoline, + __pa_symbol(_entry_trampoline), PAGE_KERNEL_RX); +#endif +} + +static __init void setup_cpu_entry_area_ptes(void) +{ +#ifdef CONFIG_X86_32 + unsigned long start, end; + + BUILD_BUG_ON(CPU_ENTRY_AREA_PAGES * PAGE_SIZE < CPU_ENTRY_AREA_MAP_SIZE); + BUG_ON(CPU_ENTRY_AREA_BASE & ~PMD_MASK); + + start = CPU_ENTRY_AREA_BASE; + end = start + CPU_ENTRY_AREA_MAP_SIZE; + + /* Careful here: start + PMD_SIZE might wrap around */ + for (; start < end && start >= CPU_ENTRY_AREA_BASE; start += PMD_SIZE) + populate_extra_pte(start); +#endif +} + +void __init setup_cpu_entry_areas(void) +{ + unsigned int cpu; + + setup_cpu_entry_area_ptes(); + + for_each_possible_cpu(cpu) + setup_cpu_entry_area(cpu); +} diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c index 5e3ac6fe6c9e..43dedbfb7257 100644 --- a/arch/x86/mm/dump_pagetables.c +++ b/arch/x86/mm/dump_pagetables.c @@ -44,10 +44,12 @@ struct addr_marker { unsigned long max_lines; }; -/* indices for address_markers; keep sync'd w/ address_markers below */ +/* Address space markers hints */ + +#ifdef CONFIG_X86_64 + enum address_markers_idx { USER_SPACE_NR = 0, -#ifdef CONFIG_X86_64 KERNEL_SPACE_NR, LOW_KERNEL_NR, VMALLOC_START_NR, @@ -56,56 +58,74 @@ enum address_markers_idx { KASAN_SHADOW_START_NR, KASAN_SHADOW_END_NR, #endif -# ifdef CONFIG_X86_ESPFIX64 + CPU_ENTRY_AREA_NR, +#ifdef CONFIG_X86_ESPFIX64 ESPFIX_START_NR, -# endif +#endif +#ifdef CONFIG_EFI + EFI_END_NR, +#endif HIGH_KERNEL_NR, MODULES_VADDR_NR, MODULES_END_NR, -#else + FIXADDR_START_NR, + END_OF_SPACE_NR, +}; + +static struct addr_marker address_markers[] = { + [USER_SPACE_NR] = { 0, "User Space" }, + [KERNEL_SPACE_NR] = { (1UL << 63), "Kernel Space" }, + [LOW_KERNEL_NR] = { 0UL, "Low Kernel Mapping" }, + [VMALLOC_START_NR] = { 0UL, "vmalloc() Area" }, + [VMEMMAP_START_NR] = { 0UL, "Vmemmap" }, +#ifdef CONFIG_KASAN + [KASAN_SHADOW_START_NR] = { KASAN_SHADOW_START, "KASAN shadow" }, + [KASAN_SHADOW_END_NR] = { KASAN_SHADOW_END, "KASAN shadow end" }, +#endif + [CPU_ENTRY_AREA_NR] = { CPU_ENTRY_AREA_BASE,"CPU entry Area" }, +#ifdef CONFIG_X86_ESPFIX64 + [ESPFIX_START_NR] = { ESPFIX_BASE_ADDR, "ESPfix Area", 16 }, +#endif +#ifdef CONFIG_EFI + [EFI_END_NR] = { EFI_VA_END, "EFI Runtime Services" }, +#endif + [HIGH_KERNEL_NR] = { __START_KERNEL_map, "High Kernel Mapping" }, + [MODULES_VADDR_NR] = { MODULES_VADDR, "Modules" }, + [MODULES_END_NR] = { MODULES_END, "End Modules" }, + [FIXADDR_START_NR] = { FIXADDR_START, "Fixmap Area" }, + [END_OF_SPACE_NR] = { -1, NULL } +}; + +#else /* CONFIG_X86_64 */ + +enum address_markers_idx { + USER_SPACE_NR = 0, KERNEL_SPACE_NR, VMALLOC_START_NR, VMALLOC_END_NR, -# ifdef CONFIG_HIGHMEM +#ifdef CONFIG_HIGHMEM PKMAP_BASE_NR, -# endif - FIXADDR_START_NR, #endif + CPU_ENTRY_AREA_NR, + FIXADDR_START_NR, + END_OF_SPACE_NR, }; -/* Address space markers hints */ static struct addr_marker address_markers[] = { - { 0, "User Space" }, -#ifdef CONFIG_X86_64 - { 0x8000000000000000UL, "Kernel Space" }, - { 0/* PAGE_OFFSET */, "Low Kernel Mapping" }, - { 0/* VMALLOC_START */, "vmalloc() Area" }, - { 0/* VMEMMAP_START */, "Vmemmap" }, -#ifdef CONFIG_KASAN - { KASAN_SHADOW_START, "KASAN shadow" }, - { KASAN_SHADOW_END, "KASAN shadow end" }, + [USER_SPACE_NR] = { 0, "User Space" }, + [KERNEL_SPACE_NR] = { PAGE_OFFSET, "Kernel Mapping" }, + [VMALLOC_START_NR] = { 0UL, "vmalloc() Area" }, + [VMALLOC_END_NR] = { 0UL, "vmalloc() End" }, +#ifdef CONFIG_HIGHMEM + [PKMAP_BASE_NR] = { 0UL, "Persistent kmap() Area" }, #endif -# ifdef CONFIG_X86_ESPFIX64 - { ESPFIX_BASE_ADDR, "ESPfix Area", 16 }, -# endif -# ifdef CONFIG_EFI - { EFI_VA_END, "EFI Runtime Services" }, -# endif - { __START_KERNEL_map, "High Kernel Mapping" }, - { MODULES_VADDR, "Modules" }, - { MODULES_END, "End Modules" }, -#else - { PAGE_OFFSET, "Kernel Mapping" }, - { 0/* VMALLOC_START */, "vmalloc() Area" }, - { 0/*VMALLOC_END*/, "vmalloc() End" }, -# ifdef CONFIG_HIGHMEM - { 0/*PKMAP_BASE*/, "Persistent kmap() Area" }, -# endif - { 0/*FIXADDR_START*/, "Fixmap Area" }, -#endif - { -1, NULL } /* End of list */ + [CPU_ENTRY_AREA_NR] = { 0UL, "CPU entry area" }, + [FIXADDR_START_NR] = { 0UL, "Fixmap area" }, + [END_OF_SPACE_NR] = { -1, NULL } }; +#endif /* !CONFIG_X86_64 */ + /* Multipliers for offsets within the PTEs */ #define PTE_LEVEL_MULT (PAGE_SIZE) #define PMD_LEVEL_MULT (PTRS_PER_PTE * PTE_LEVEL_MULT) @@ -140,7 +160,7 @@ static void printk_prot(struct seq_file *m, pgprot_t prot, int level, bool dmsg) static const char * const level_name[] = { "cr3", "pgd", "p4d", "pud", "pmd", "pte" }; - if (!pgprot_val(prot)) { + if (!(pr & _PAGE_PRESENT)) { /* Not present */ pt_dump_cont_printf(m, dmsg, " "); } else { @@ -525,8 +545,8 @@ static int __init pt_dump_init(void) address_markers[PKMAP_BASE_NR].start_address = PKMAP_BASE; # endif address_markers[FIXADDR_START_NR].start_address = FIXADDR_START; + address_markers[CPU_ENTRY_AREA_NR].start_address = CPU_ENTRY_AREA_BASE; #endif - return 0; } __initcall(pt_dump_init); diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c index 8a64a6f2848d..135c9a7898c7 100644 --- a/arch/x86/mm/init_32.c +++ b/arch/x86/mm/init_32.c @@ -50,6 +50,7 @@ #include <asm/setup.h> #include <asm/set_memory.h> #include <asm/page_types.h> +#include <asm/cpu_entry_area.h> #include <asm/init.h> #include "mm_internal.h" @@ -766,6 +767,7 @@ void __init mem_init(void) mem_init_print_info(NULL); printk(KERN_INFO "virtual kernel memory layout:\n" " fixmap : 0x%08lx - 0x%08lx (%4ld kB)\n" + " cpu_entry : 0x%08lx - 0x%08lx (%4ld kB)\n" #ifdef CONFIG_HIGHMEM " pkmap : 0x%08lx - 0x%08lx (%4ld kB)\n" #endif @@ -777,6 +779,10 @@ void __init mem_init(void) FIXADDR_START, FIXADDR_TOP, (FIXADDR_TOP - FIXADDR_START) >> 10, + CPU_ENTRY_AREA_BASE, + CPU_ENTRY_AREA_BASE + CPU_ENTRY_AREA_MAP_SIZE, + CPU_ENTRY_AREA_MAP_SIZE >> 10, + #ifdef CONFIG_HIGHMEM PKMAP_BASE, PKMAP_BASE+LAST_PKMAP*PAGE_SIZE, (LAST_PKMAP*PAGE_SIZE) >> 10, diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c index 9ec70d780f1f..47388f0c0e59 100644 --- a/arch/x86/mm/kasan_init_64.c +++ b/arch/x86/mm/kasan_init_64.c @@ -15,6 +15,7 @@ #include <asm/tlbflush.h> #include <asm/sections.h> #include <asm/pgtable.h> +#include <asm/cpu_entry_area.h> extern struct range pfn_mapped[E820_MAX_ENTRIES]; @@ -322,31 +323,33 @@ void __init kasan_init(void) map_range(&pfn_mapped[i]); } - kasan_populate_zero_shadow( - kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM), - kasan_mem_to_shadow((void *)__START_KERNEL_map)); - - kasan_populate_shadow((unsigned long)kasan_mem_to_shadow(_stext), - (unsigned long)kasan_mem_to_shadow(_end), - early_pfn_to_nid(__pa(_stext))); - - shadow_cpu_entry_begin = (void *)__fix_to_virt(FIX_CPU_ENTRY_AREA_BOTTOM); + shadow_cpu_entry_begin = (void *)CPU_ENTRY_AREA_BASE; shadow_cpu_entry_begin = kasan_mem_to_shadow(shadow_cpu_entry_begin); shadow_cpu_entry_begin = (void *)round_down((unsigned long)shadow_cpu_entry_begin, PAGE_SIZE); - shadow_cpu_entry_end = (void *)(__fix_to_virt(FIX_CPU_ENTRY_AREA_TOP) + PAGE_SIZE); + shadow_cpu_entry_end = (void *)(CPU_ENTRY_AREA_BASE + + CPU_ENTRY_AREA_MAP_SIZE); shadow_cpu_entry_end = kasan_mem_to_shadow(shadow_cpu_entry_end); shadow_cpu_entry_end = (void *)round_up((unsigned long)shadow_cpu_entry_end, PAGE_SIZE); - kasan_populate_zero_shadow(kasan_mem_to_shadow((void *)MODULES_END), - shadow_cpu_entry_begin); + kasan_populate_zero_shadow( + kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM), + shadow_cpu_entry_begin); kasan_populate_shadow((unsigned long)shadow_cpu_entry_begin, (unsigned long)shadow_cpu_entry_end, 0); - kasan_populate_zero_shadow(shadow_cpu_entry_end, (void *)KASAN_SHADOW_END); + kasan_populate_zero_shadow(shadow_cpu_entry_end, + kasan_mem_to_shadow((void *)__START_KERNEL_map)); + + kasan_populate_shadow((unsigned long)kasan_mem_to_shadow(_stext), + (unsigned long)kasan_mem_to_shadow(_end), + early_pfn_to_nid(__pa(_stext))); + + kasan_populate_zero_shadow(kasan_mem_to_shadow((void *)MODULES_END), + (void *)KASAN_SHADOW_END); load_cr3(init_top_pgt); __flush_tlb_all(); diff --git a/arch/x86/mm/pgtable_32.c b/arch/x86/mm/pgtable_32.c index 6b9bf023a700..c3c5274410a9 100644 --- a/arch/x86/mm/pgtable_32.c +++ b/arch/x86/mm/pgtable_32.c @@ -10,6 +10,7 @@ #include <linux/pagemap.h> #include <linux/spinlock.h> +#include <asm/cpu_entry_area.h> #include <asm/pgtable.h> #include <asm/pgalloc.h> #include <asm/fixmap.h> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 3118392cdf75..0a1be3adc97e 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -128,7 +128,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, * isn't free. */ #ifdef CONFIG_DEBUG_VM - if (WARN_ON_ONCE(__read_cr3() != build_cr3(real_prev, prev_asid))) { + if (WARN_ON_ONCE(__read_cr3() != build_cr3(real_prev->pgd, prev_asid))) { /* * If we were to BUG here, we'd be very likely to kill * the system so hard that we don't see the call trace. @@ -195,7 +195,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, if (need_flush) { this_cpu_write(cpu_tlbstate.ctxs[new_asid].ctx_id, next->context.ctx_id); this_cpu_write(cpu_tlbstate.ctxs[new_asid].tlb_gen, next_tlb_gen); - write_cr3(build_cr3(next, new_asid)); + write_cr3(build_cr3(next->pgd, new_asid)); /* * NB: This gets called via leave_mm() in the idle path @@ -208,7 +208,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL); } else { /* The new ASID is already up to date. */ - write_cr3(build_cr3_noflush(next, new_asid)); + write_cr3(build_cr3_noflush(next->pgd, new_asid)); /* See above wrt _rcuidle. */ trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, 0); @@ -288,7 +288,7 @@ void initialize_tlbstate_and_flush(void) !(cr4_read_shadow() & X86_CR4_PCIDE)); /* Force ASID 0 and force a TLB flush. */ - write_cr3(build_cr3(mm, 0)); + write_cr3(build_cr3(mm->pgd, 0)); /* Reinitialize tlbstate. */ this_cpu_write(cpu_tlbstate.loaded_mm_asid, 0); @@ -551,7 +551,7 @@ static void do_kernel_range_flush(void *info) /* flush range by one by one 'invlpg' */ for (addr = f->start; addr < f->end; addr += PAGE_SIZE) - __flush_tlb_single(addr); + __flush_tlb_one(addr); } void flush_tlb_kernel_range(unsigned long start, unsigned long end) diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c index f44c0bc95aa2..8538a6723171 100644 --- a/arch/x86/platform/uv/tlb_uv.c +++ b/arch/x86/platform/uv/tlb_uv.c @@ -299,7 +299,7 @@ static void bau_process_message(struct msg_desc *mdp, struct bau_control *bcp, local_flush_tlb(); stat->d_alltlb++; } else { - __flush_tlb_one(msg->address); + __flush_tlb_single(msg->address); stat->d_onetlb++; } stat->d_requestee++; diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c index c2454237fa67..a0e2b8c6e5c7 100644 --- a/arch/x86/xen/mmu_pv.c +++ b/arch/x86/xen/mmu_pv.c @@ -2261,7 +2261,6 @@ static void xen_set_fixmap(unsigned idx, phys_addr_t phys, pgprot_t prot) switch (idx) { case FIX_BTMAP_END ... FIX_BTMAP_BEGIN: - case FIX_RO_IDT: #ifdef CONFIG_X86_32 case FIX_WP_TEST: # ifdef CONFIG_HIGHMEM @@ -2272,7 +2271,6 @@ static void xen_set_fixmap(unsigned idx, phys_addr_t phys, pgprot_t prot) #endif case FIX_TEXT_POKE0: case FIX_TEXT_POKE1: - case FIX_CPU_ENTRY_AREA_TOP ... FIX_CPU_ENTRY_AREA_BOTTOM: /* All local page mappings */ pte = pfn_pte(phys, prot); break; diff --git a/block/bio.c b/block/bio.c index 33fa6b4af312..7f978eac9a7a 100644 --- a/block/bio.c +++ b/block/bio.c @@ -599,6 +599,8 @@ void __bio_clone_fast(struct bio *bio, struct bio *bio_src) bio->bi_disk = bio_src->bi_disk; bio->bi_partno = bio_src->bi_partno; bio_set_flag(bio, BIO_CLONED); + if (bio_flagged(bio_src, BIO_THROTTLED)) + bio_set_flag(bio, BIO_THROTTLED); bio->bi_opf = bio_src->bi_opf; bio->bi_write_hint = bio_src->bi_write_hint; bio->bi_iter = bio_src->bi_iter; diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 8631763866c6..a8cd7b3d9647 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -2223,13 +2223,7 @@ bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg, out_unlock: spin_unlock_irq(q->queue_lock); out: - /* - * As multiple blk-throtls may stack in the same issue path, we - * don't want bios to leave with the flag set. Clear the flag if - * being issued. - */ - if (!throttled) - bio_clear_flag(bio, BIO_THROTTLED); + bio_set_flag(bio, BIO_THROTTLED); #ifdef CONFIG_BLK_DEV_THROTTLING_LOW if (throttled || !td->track_bio_latency) diff --git a/crypto/af_alg.c b/crypto/af_alg.c index e181073ef64d..6ec360213107 100644 --- a/crypto/af_alg.c +++ b/crypto/af_alg.c @@ -1165,12 +1165,6 @@ int af_alg_get_rsgl(struct sock *sk, struct msghdr *msg, int flags, if (!af_alg_readable(sk)) break; - if (!ctx->used) { - err = af_alg_wait_for_data(sk, flags); - if (err) - return err; - } - seglen = min_t(size_t, (maxsize - len), msg_data_left(msg)); diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c index 3d793bc2aa82..782cb8fec323 100644 --- a/crypto/algif_aead.c +++ b/crypto/algif_aead.c @@ -111,6 +111,12 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg, size_t usedpages = 0; /* [in] RX bufs to be used from user */ size_t processed = 0; /* [in] TX bufs to be consumed */ + if (!ctx->used) { + err = af_alg_wait_for_data(sk, flags); + if (err) + return err; + } + /* * Data length provided by caller via sendmsg/sendpage that has not * yet been processed. @@ -285,6 +291,10 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg, /* AIO operation */ sock_hold(sk); areq->iocb = msg->msg_iocb; + + /* Remember output size that will be generated. */ + areq->outlen = outlen; + aead_request_set_callback(&areq->cra_u.aead_req, CRYPTO_TFM_REQ_MAY_BACKLOG, af_alg_async_cb, areq); @@ -292,12 +302,8 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg, crypto_aead_decrypt(&areq->cra_u.aead_req); /* AIO operation in progress */ - if (err == -EINPROGRESS || err == -EBUSY) { - /* Remember output size that will be generated. */ - areq->outlen = outlen; - + if (err == -EINPROGRESS || err == -EBUSY) return -EIOCBQUEUED; - } sock_put(sk); } else { diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c index 30ee2a8e8f42..7a3e663d54d5 100644 --- a/crypto/algif_skcipher.c +++ b/crypto/algif_skcipher.c @@ -72,6 +72,12 @@ static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg, int err = 0; size_t len = 0; + if (!ctx->used) { + err = af_alg_wait_for_data(sk, flags); + if (err) + return err; + } + /* Allocate cipher request for current operation. */ areq = af_alg_alloc_areq(sk, sizeof(struct af_alg_async_req) + crypto_skcipher_reqsize(tfm)); @@ -119,6 +125,10 @@ static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg, /* AIO operation */ sock_hold(sk); areq->iocb = msg->msg_iocb; + + /* Remember output size that will be generated. */ + areq->outlen = len; + skcipher_request_set_callback(&areq->cra_u.skcipher_req, CRYPTO_TFM_REQ_MAY_SLEEP, af_alg_async_cb, areq); @@ -127,12 +137,8 @@ static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg, crypto_skcipher_decrypt(&areq->cra_u.skcipher_req); /* AIO operation in progress */ - if (err == -EINPROGRESS || err == -EBUSY) { - /* Remember output size that will be generated. */ - areq->outlen = len; - + if (err == -EINPROGRESS || err == -EBUSY) return -EIOCBQUEUED; - } sock_put(sk); } else { diff --git a/crypto/mcryptd.c b/crypto/mcryptd.c index 4e6472658852..eca04d3729b3 100644 --- a/crypto/mcryptd.c +++ b/crypto/mcryptd.c @@ -81,6 +81,7 @@ static int mcryptd_init_queue(struct mcryptd_queue *queue, pr_debug("cpu_queue #%d %p\n", cpu, queue->cpu_queue); crypto_init_queue(&cpu_queue->queue, max_cpu_qlen); INIT_WORK(&cpu_queue->work, mcryptd_queue_worker); + spin_lock_init(&cpu_queue->q_lock); } return 0; } @@ -104,15 +105,16 @@ static int mcryptd_enqueue_request(struct mcryptd_queue *queue, int cpu, err; struct mcryptd_cpu_queue *cpu_queue; - cpu = get_cpu(); - cpu_queue = this_cpu_ptr(queue->cpu_queue); - rctx->tag.cpu = cpu; + cpu_queue = raw_cpu_ptr(queue->cpu_queue); + spin_lock(&cpu_queue->q_lock); + cpu = smp_processor_id(); + rctx->tag.cpu = smp_processor_id(); err = crypto_enqueue_request(&cpu_queue->queue, request); pr_debug("enqueue request: cpu %d cpu_queue %p request %p\n", cpu, cpu_queue, request); + spin_unlock(&cpu_queue->q_lock); queue_work_on(cpu, kcrypto_wq, &cpu_queue->work); - put_cpu(); return err; } @@ -161,16 +163,11 @@ static void mcryptd_queue_worker(struct work_struct *work) cpu_queue = container_of(work, struct mcryptd_cpu_queue, work); i = 0; while (i < MCRYPTD_BATCH || single_task_running()) { - /* - * preempt_disable/enable is used to prevent - * being preempted by mcryptd_enqueue_request() - */ - local_bh_disable(); - preempt_disable(); + + spin_lock_bh(&cpu_queue->q_lock); backlog = crypto_get_backlog(&cpu_queue->queue); req = crypto_dequeue_request(&cpu_queue->queue); - preempt_enable(); - local_bh_enable(); + spin_unlock_bh(&cpu_queue->q_lock); if (!req) { mcryptd_opportunistic_flush(); @@ -185,7 +182,7 @@ static void mcryptd_queue_worker(struct work_struct *work) ++i; } if (cpu_queue->queue.qlen) - queue_work(kcrypto_wq, &cpu_queue->work); + queue_work_on(smp_processor_id(), kcrypto_wq, &cpu_queue->work); } void mcryptd_flusher(struct work_struct *__work) diff --git a/crypto/skcipher.c b/crypto/skcipher.c index 778e0ff42bfa..11af5fd6a443 100644 --- a/crypto/skcipher.c +++ b/crypto/skcipher.c @@ -449,6 +449,8 @@ static int skcipher_walk_skcipher(struct skcipher_walk *walk, walk->total = req->cryptlen; walk->nbytes = 0; + walk->iv = req->iv; + walk->oiv = req->iv; if (unlikely(!walk->total)) return 0; @@ -456,9 +458,6 @@ static int skcipher_walk_skcipher(struct skcipher_walk *walk, scatterwalk_start(&walk->in, req->src); scatterwalk_start(&walk->out, req->dst); - walk->iv = req->iv; - walk->oiv = req->iv; - walk->flags &= ~SKCIPHER_WALK_SLEEP; walk->flags |= req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP ? SKCIPHER_WALK_SLEEP : 0; @@ -510,6 +509,8 @@ static int skcipher_walk_aead_common(struct skcipher_walk *walk, int err; walk->nbytes = 0; + walk->iv = req->iv; + walk->oiv = req->iv; if (unlikely(!walk->total)) return 0; @@ -525,9 +526,6 @@ static int skcipher_walk_aead_common(struct skcipher_walk *walk, scatterwalk_done(&walk->in, 0, walk->total); scatterwalk_done(&walk->out, 0, walk->total); - walk->iv = req->iv; - walk->oiv = req->iv; - if (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) walk->flags |= SKCIPHER_WALK_SLEEP; else diff --git a/drivers/acpi/apei/erst.c b/drivers/acpi/apei/erst.c index 2c462beee551..a943cf17faa7 100644 --- a/drivers/acpi/apei/erst.c +++ b/drivers/acpi/apei/erst.c @@ -1007,7 +1007,7 @@ static ssize_t erst_reader(struct pstore_record *record) /* The record may be cleared by others, try read next record */ if (len == -ENOENT) goto skip; - else if (len < sizeof(*rcd)) { + else if (len < 0 || len < sizeof(*rcd)) { rc = -EIO; goto out; } diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c index 9c2c49b6a240..dea0fb3d6f64 100644 --- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -1457,6 +1457,11 @@ static int acpi_nfit_add_dimm(struct acpi_nfit_desc *acpi_desc, dev_name(&adev_dimm->dev)); return -ENXIO; } + /* + * Record nfit_mem for the notification path to track back to + * the nfit sysfs attributes for this dimm device object. + */ + dev_set_drvdata(&adev_dimm->dev, nfit_mem); /* * Until standardization materializes we need to consider 4 @@ -1516,9 +1521,11 @@ static void shutdown_dimm_notify(void *data) sysfs_put(nfit_mem->flags_attr); nfit_mem->flags_attr = NULL; } - if (adev_dimm) + if (adev_dimm) { acpi_remove_notify_handler(adev_dimm->handle, ACPI_DEVICE_NOTIFY, acpi_nvdimm_notify); + dev_set_drvdata(&adev_dimm->dev, NULL); + } } mutex_unlock(&acpi_desc->init_mutex); } diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c index e1cbb78c6806..c04aa11f0e21 100644 --- a/drivers/char/ipmi/ipmi_si_intf.c +++ b/drivers/char/ipmi/ipmi_si_intf.c @@ -3469,7 +3469,6 @@ static int add_smi(struct smi_info *new_smi) ipmi_addr_src_to_str(new_smi->addr_source), si_to_str[new_smi->si_type]); rv = -EBUSY; - kfree(new_smi); goto out_err; } } diff --git a/drivers/clk/sunxi/clk-sun9i-mmc.c b/drivers/clk/sunxi/clk-sun9i-mmc.c index 6041bdba2e97..f69f9e8c6f38 100644 --- a/drivers/clk/sunxi/clk-sun9i-mmc.c +++ b/drivers/clk/sunxi/clk-sun9i-mmc.c @@ -16,6 +16,7 @@ #include <linux/clk.h> #include <linux/clk-provider.h> +#include <linux/delay.h> #include <linux/init.h> #include <linux/of.h> #include <linux/of_device.h> @@ -83,9 +84,20 @@ static int sun9i_mmc_reset_deassert(struct reset_controller_dev *rcdev, return 0; } +static int sun9i_mmc_reset_reset(struct reset_controller_dev *rcdev, + unsigned long id) +{ + sun9i_mmc_reset_assert(rcdev, id); + udelay(10); + sun9i_mmc_reset_deassert(rcdev, id); + + return 0; +} + static const struct reset_control_ops sun9i_mmc_reset_ops = { .assert = sun9i_mmc_reset_assert, .deassert = sun9i_mmc_reset_deassert, + .reset = sun9i_mmc_reset_reset, }; static int sun9i_a80_mmc_config_clk_probe(struct platform_device *pdev) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index dc1faa49687d..3b2c0538e48d 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -325,17 +325,10 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj) * must wait for all rendering to complete to the object (as unbinding * must anyway), and retire the requests. */ - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_LOCKED | - I915_WAIT_ALL, - MAX_SCHEDULE_TIMEOUT, - NULL); + ret = i915_gem_object_set_to_cpu_domain(obj, false); if (ret) return ret; - i915_gem_retire_requests(to_i915(obj->base.dev)); - while ((vma = list_first_entry_or_null(&obj->vma_list, struct i915_vma, obj_link))) { diff --git a/drivers/gpu/drm/sun4i/sun4i_tcon.c b/drivers/gpu/drm/sun4i/sun4i_tcon.c index d9791292553e..7b909d814d38 100644 --- a/drivers/gpu/drm/sun4i/sun4i_tcon.c +++ b/drivers/gpu/drm/sun4i/sun4i_tcon.c @@ -567,12 +567,12 @@ static int sun4i_tcon_bind(struct device *dev, struct device *master, if (IS_ERR(tcon->crtc)) { dev_err(dev, "Couldn't create our CRTC\n"); ret = PTR_ERR(tcon->crtc); - goto err_free_clocks; + goto err_free_dotclock; } ret = sun4i_rgb_init(drm, tcon); if (ret < 0) - goto err_free_clocks; + goto err_free_dotclock; list_add_tail(&tcon->list, &drv->tcon_list); diff --git a/drivers/mfd/cros_ec_spi.c b/drivers/mfd/cros_ec_spi.c index c9714072e224..a14196e95e9b 100644 --- a/drivers/mfd/cros_ec_spi.c +++ b/drivers/mfd/cros_ec_spi.c @@ -667,6 +667,7 @@ static int cros_ec_spi_probe(struct spi_device *spi) sizeof(struct ec_response_get_protocol_info); ec_dev->dout_size = sizeof(struct ec_host_request); + ec_spi->last_transfer_ns = ktime_get_ns(); err = cros_ec_register(ec_dev); if (err) { diff --git a/drivers/mfd/twl4030-audio.c b/drivers/mfd/twl4030-audio.c index da16bf45fab4..dc94ffc6321a 100644 --- a/drivers/mfd/twl4030-audio.c +++ b/drivers/mfd/twl4030-audio.c @@ -159,13 +159,18 @@ unsigned int twl4030_audio_get_mclk(void) EXPORT_SYMBOL_GPL(twl4030_audio_get_mclk); static bool twl4030_audio_has_codec(struct twl4030_audio_data *pdata, - struct device_node *node) + struct device_node *parent) { + struct device_node *node; + if (pdata && pdata->codec) return true; - if (of_find_node_by_name(node, "codec")) + node = of_get_child_by_name(parent, "codec"); + if (node) { + of_node_put(node); return true; + } return false; } diff --git a/drivers/mfd/twl6040.c b/drivers/mfd/twl6040.c index d66502d36ba0..dd19f17a1b63 100644 --- a/drivers/mfd/twl6040.c +++ b/drivers/mfd/twl6040.c @@ -97,12 +97,16 @@ static struct reg_sequence twl6040_patch[] = { }; -static bool twl6040_has_vibra(struct device_node *node) +static bool twl6040_has_vibra(struct device_node *parent) { -#ifdef CONFIG_OF - if (of_find_node_by_name(node, "vibra")) + struct device_node *node; + + node = of_get_child_by_name(parent, "vibra"); + if (node) { + of_node_put(node); return true; -#endif + } + return false; } diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c index bc93b69cfd1e..a539263cd79c 100644 --- a/drivers/net/ethernet/marvell/mvneta.c +++ b/drivers/net/ethernet/marvell/mvneta.c @@ -1214,6 +1214,10 @@ static void mvneta_port_disable(struct mvneta_port *pp) val &= ~MVNETA_GMAC0_PORT_ENABLE; mvreg_write(pp, MVNETA_GMAC_CTRL_0, val); + pp->link = 0; + pp->duplex = -1; + pp->speed = 0; + udelay(200); } @@ -1958,9 +1962,9 @@ static int mvneta_rx_swbm(struct mvneta_port *pp, int rx_todo, if (!mvneta_rxq_desc_is_first_last(rx_status) || (rx_status & MVNETA_RXD_ERR_SUMMARY)) { + mvneta_rx_error(pp, rx_desc); err_drop_frame: dev->stats.rx_errors++; - mvneta_rx_error(pp, rx_desc); /* leave the descriptor untouched */ continue; } @@ -3011,7 +3015,7 @@ static void mvneta_cleanup_rxqs(struct mvneta_port *pp) { int queue; - for (queue = 0; queue < txq_number; queue++) + for (queue = 0; queue < rxq_number; queue++) mvneta_rxq_deinit(pp, &pp->rxqs[queue]); } diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c index d5612bd1cc81..09428ebd315b 100644 --- a/drivers/nvdimm/btt.c +++ b/drivers/nvdimm/btt.c @@ -210,12 +210,12 @@ static int btt_map_read(struct arena_info *arena, u32 lba, u32 *mapping, return ret; } -static int btt_log_read_pair(struct arena_info *arena, u32 lane, - struct log_entry *ent) +static int btt_log_group_read(struct arena_info *arena, u32 lane, + struct log_group *log) { return arena_read_bytes(arena, - arena->logoff + (2 * lane * LOG_ENT_SIZE), ent, - 2 * LOG_ENT_SIZE, 0); + arena->logoff + (lane * LOG_GRP_SIZE), log, + LOG_GRP_SIZE, 0); } static struct dentry *debugfs_root; @@ -255,6 +255,8 @@ static void arena_debugfs_init(struct arena_info *a, struct dentry *parent, debugfs_create_x64("logoff", S_IRUGO, d, &a->logoff); debugfs_create_x64("info2off", S_IRUGO, d, &a->info2off); debugfs_create_x32("flags", S_IRUGO, d, &a->flags); + debugfs_create_u32("log_index_0", S_IRUGO, d, &a->log_index[0]); + debugfs_create_u32("log_index_1", S_IRUGO, d, &a->log_index[1]); } static void btt_debugfs_init(struct btt *btt) @@ -273,6 +275,11 @@ static void btt_debugfs_init(struct btt *btt) } } +static u32 log_seq(struct log_group *log, int log_idx) +{ + return le32_to_cpu(log->ent[log_idx].seq); +} + /* * This function accepts two log entries, and uses the * sequence number to find the 'older' entry. @@ -282,8 +289,10 @@ static void btt_debugfs_init(struct btt *btt) * * TODO The logic feels a bit kludge-y. make it better.. */ -static int btt_log_get_old(struct log_entry *ent) +static int btt_log_get_old(struct arena_info *a, struct log_group *log) { + int idx0 = a->log_index[0]; + int idx1 = a->log_index[1]; int old; /* @@ -291,23 +300,23 @@ static int btt_log_get_old(struct log_entry *ent) * the next time, the following logic works out to put this * (next) entry into [1] */ - if (ent[0].seq == 0) { - ent[0].seq = cpu_to_le32(1); + if (log_seq(log, idx0) == 0) { + log->ent[idx0].seq = cpu_to_le32(1); return 0; } - if (ent[0].seq == ent[1].seq) + if (log_seq(log, idx0) == log_seq(log, idx1)) return -EINVAL; - if (le32_to_cpu(ent[0].seq) + le32_to_cpu(ent[1].seq) > 5) + if (log_seq(log, idx0) + log_seq(log, idx1) > 5) return -EINVAL; - if (le32_to_cpu(ent[0].seq) < le32_to_cpu(ent[1].seq)) { - if (le32_to_cpu(ent[1].seq) - le32_to_cpu(ent[0].seq) == 1) + if (log_seq(log, idx0) < log_seq(log, idx1)) { + if ((log_seq(log, idx1) - log_seq(log, idx0)) == 1) old = 0; else old = 1; } else { - if (le32_to_cpu(ent[0].seq) - le32_to_cpu(ent[1].seq) == 1) + if ((log_seq(log, idx0) - log_seq(log, idx1)) == 1) old = 1; else old = 0; @@ -327,17 +336,18 @@ static int btt_log_read(struct arena_info *arena, u32 lane, { int ret; int old_ent, ret_ent; - struct log_entry log[2]; + struct log_group log; - ret = btt_log_read_pair(arena, lane, log); + ret = btt_log_group_read(arena, lane, &log); if (ret) return -EIO; - old_ent = btt_log_get_old(log); + old_ent = btt_log_get_old(arena, &log); if (old_ent < 0 || old_ent > 1) { dev_err(to_dev(arena), "log corruption (%d): lane %d seq [%d, %d]\n", - old_ent, lane, log[0].seq, log[1].seq); + old_ent, lane, log.ent[arena->log_index[0]].seq, + log.ent[arena->log_index[1]].seq); /* TODO set error state? */ return -EIO; } @@ -345,7 +355,7 @@ static int btt_log_read(struct arena_info *arena, u32 lane, ret_ent = (old_flag ? old_ent : (1 - old_ent)); if (ent != NULL) - memcpy(ent, &log[ret_ent], LOG_ENT_SIZE); + memcpy(ent, &log.ent[arena->log_index[ret_ent]], LOG_ENT_SIZE); return ret_ent; } @@ -359,17 +369,13 @@ static int __btt_log_write(struct arena_info *arena, u32 lane, u32 sub, struct log_entry *ent, unsigned long flags) { int ret; - /* - * Ignore the padding in log_entry for calculating log_half. - * The entry is 'committed' when we write the sequence number, - * and we want to ensure that that is the last thing written. - * We don't bother writing the padding as that would be extra - * media wear and write amplification - */ - unsigned int log_half = (LOG_ENT_SIZE - 2 * sizeof(u64)) / 2; - u64 ns_off = arena->logoff + (((2 * lane) + sub) * LOG_ENT_SIZE); + u32 group_slot = arena->log_index[sub]; + unsigned int log_half = LOG_ENT_SIZE / 2; void *src = ent; + u64 ns_off; + ns_off = arena->logoff + (lane * LOG_GRP_SIZE) + + (group_slot * LOG_ENT_SIZE); /* split the 16B write into atomic, durable halves */ ret = arena_write_bytes(arena, ns_off, src, log_half, flags); if (ret) @@ -452,7 +458,7 @@ static int btt_log_init(struct arena_info *arena) { size_t logsize = arena->info2off - arena->logoff; size_t chunk_size = SZ_4K, offset = 0; - struct log_entry log; + struct log_entry ent; void *zerobuf; int ret; u32 i; @@ -484,11 +490,11 @@ static int btt_log_init(struct arena_info *arena) } for (i = 0; i < arena->nfree; i++) { - log.lba = cpu_to_le32(i); - log.old_map = cpu_to_le32(arena->external_nlba + i); - log.new_map = cpu_to_le32(arena->external_nlba + i); - log.seq = cpu_to_le32(LOG_SEQ_INIT); - ret = __btt_log_write(arena, i, 0, &log, 0); + ent.lba = cpu_to_le32(i); + ent.old_map = cpu_to_le32(arena->external_nlba + i); + ent.new_map = cpu_to_le32(arena->external_nlba + i); + ent.seq = cpu_to_le32(LOG_SEQ_INIT); + ret = __btt_log_write(arena, i, 0, &ent, 0); if (ret) goto free; } @@ -593,6 +599,123 @@ static int btt_freelist_init(struct arena_info *arena) return 0; } +static bool ent_is_padding(struct log_entry *ent) +{ + return (ent->lba == 0) && (ent->old_map == 0) && (ent->new_map == 0) + && (ent->seq == 0); +} + +/* + * Detecting valid log indices: We read a log group (see the comments in btt.h + * for a description of a 'log_group' and its 'slots'), and iterate over its + * four slots. We expect that a padding slot will be all-zeroes, and use this + * to detect a padding slot vs. an actual entry. + * + * If a log_group is in the initial state, i.e. hasn't been used since the + * creation of this BTT layout, it will have three of the four slots with + * zeroes. We skip over these log_groups for the detection of log_index. If + * all log_groups are in the initial state (i.e. the BTT has never been + * written to), it is safe to assume the 'new format' of log entries in slots + * (0, 1). + */ +static int log_set_indices(struct arena_info *arena) +{ + bool idx_set = false, initial_state = true; + int ret, log_index[2] = {-1, -1}; + u32 i, j, next_idx = 0; + struct log_group log; + u32 pad_count = 0; + + for (i = 0; i < arena->nfree; i++) { + ret = btt_log_group_read(arena, i, &log); + if (ret < 0) + return ret; + + for (j = 0; j < 4; j++) { + if (!idx_set) { + if (ent_is_padding(&log.ent[j])) { + pad_count++; + continue; + } else { + /* Skip if index has been recorded */ + if ((next_idx == 1) && + (j == log_index[0])) + continue; + /* valid entry, record index */ + log_index[next_idx] = j; + next_idx++; + } + if (next_idx == 2) { + /* two valid entries found */ + idx_set = true; + } else if (next_idx > 2) { + /* too many valid indices */ + return -ENXIO; + } + } else { + /* + * once the indices have been set, just verify + * that all subsequent log groups are either in + * their initial state or follow the same + * indices. + */ + if (j == log_index[0]) { + /* entry must be 'valid' */ + if (ent_is_padding(&log.ent[j])) + return -ENXIO; + } else if (j == log_index[1]) { + ; + /* + * log_index[1] can be padding if the + * lane never got used and it is still + * in the initial state (three 'padding' + * entries) + */ + } else { + /* entry must be invalid (padding) */ + if (!ent_is_padding(&log.ent[j])) + return -ENXIO; + } + } + } + /* + * If any of the log_groups have more than one valid, + * non-padding entry, then the we are no longer in the + * initial_state + */ + if (pad_count < 3) + initial_state = false; + pad_count = 0; + } + + if (!initial_state && !idx_set) + return -ENXIO; + + /* + * If all the entries in the log were in the initial state, + * assume new padding scheme + */ + if (initial_state) + log_index[1] = 1; + + /* + * Only allow the known permutations of log/padding indices, + * i.e. (0, 1), and (0, 2) + */ + if ((log_index[0] == 0) && ((log_index[1] == 1) || (log_index[1] == 2))) + ; /* known index possibilities */ + else { + dev_err(to_dev(arena), "Found an unknown padding scheme\n"); + return -ENXIO; + } + + arena->log_index[0] = log_index[0]; + arena->log_index[1] = log_index[1]; + dev_dbg(to_dev(arena), "log_index_0 = %d\n", log_index[0]); + dev_dbg(to_dev(arena), "log_index_1 = %d\n", log_index[1]); + return 0; +} + static int btt_rtt_init(struct arena_info *arena) { arena->rtt = kcalloc(arena->nfree, sizeof(u32), GFP_KERNEL); @@ -649,8 +772,7 @@ static struct arena_info *alloc_arena(struct btt *btt, size_t size, available -= 2 * BTT_PG_SIZE; /* The log takes a fixed amount of space based on nfree */ - logsize = roundup(2 * arena->nfree * sizeof(struct log_entry), - BTT_PG_SIZE); + logsize = roundup(arena->nfree * LOG_GRP_SIZE, BTT_PG_SIZE); available -= logsize; /* Calculate optimal split between map and data area */ @@ -667,6 +789,10 @@ static struct arena_info *alloc_arena(struct btt *btt, size_t size, arena->mapoff = arena->dataoff + datasize; arena->logoff = arena->mapoff + mapsize; arena->info2off = arena->logoff + logsize; + + /* Default log indices are (0,1) */ + arena->log_index[0] = 0; + arena->log_index[1] = 1; return arena; } @@ -757,6 +883,13 @@ static int discover_arenas(struct btt *btt) arena->external_lba_start = cur_nlba; parse_arena_meta(arena, super, cur_off); + ret = log_set_indices(arena); + if (ret) { + dev_err(to_dev(arena), + "Unable to deduce log/padding indices\n"); + goto out; + } + mutex_init(&arena->err_lock); ret = btt_freelist_init(arena); if (ret) diff --git a/drivers/nvdimm/btt.h b/drivers/nvdimm/btt.h index 578c2057524d..2609683c4167 100644 --- a/drivers/nvdimm/btt.h +++ b/drivers/nvdimm/btt.h @@ -27,6 +27,7 @@ #define MAP_ERR_MASK (1 << MAP_ERR_SHIFT) #define MAP_LBA_MASK (~((1 << MAP_TRIM_SHIFT) | (1 << MAP_ERR_SHIFT))) #define MAP_ENT_NORMAL 0xC0000000 +#define LOG_GRP_SIZE sizeof(struct log_group) #define LOG_ENT_SIZE sizeof(struct log_entry) #define ARENA_MIN_SIZE (1UL << 24) /* 16 MB */ #define ARENA_MAX_SIZE (1ULL << 39) /* 512 GB */ @@ -50,12 +51,52 @@ enum btt_init_state { INIT_READY }; +/* + * A log group represents one log 'lane', and consists of four log entries. + * Two of the four entries are valid entries, and the remaining two are + * padding. Due to an old bug in the padding location, we need to perform a + * test to determine the padding scheme being used, and use that scheme + * thereafter. + * + * In kernels prior to 4.15, 'log group' would have actual log entries at + * indices (0, 2) and padding at indices (1, 3), where as the correct/updated + * format has log entries at indices (0, 1) and padding at indices (2, 3). + * + * Old (pre 4.15) format: + * +-----------------+-----------------+ + * | ent[0] | ent[1] | + * | 16B | 16B | + * | lba/old/new/seq | pad | + * +-----------------------------------+ + * | ent[2] | ent[3] | + * | 16B | 16B | + * | lba/old/new/seq | pad | + * +-----------------+-----------------+ + * + * New format: + * +-----------------+-----------------+ + * | ent[0] | ent[1] | + * | 16B | 16B | + * | lba/old/new/seq | lba/old/new/seq | + * +-----------------------------------+ + * | ent[2] | ent[3] | + * | 16B | 16B | + * | pad | pad | + * +-----------------+-----------------+ + * + * We detect during start-up which format is in use, and set + * arena->log_index[(0, 1)] with the detected format. + */ + struct log_entry { __le32 lba; __le32 old_map; __le32 new_map; __le32 seq; - __le64 padding[2]; +}; + +struct log_group { + struct log_entry ent[4]; }; struct btt_sb { @@ -125,6 +166,7 @@ struct aligned_lock { * @list: List head for list of arenas * @debugfs_dir: Debugfs dentry * @flags: Arena flags - may signify error states. + * @log_index: Indices of the valid log entries in a log_group * * arena_info is a per-arena handle. Once an arena is narrowed down for an * IO, this struct is passed around for the duration of the IO. @@ -157,6 +199,7 @@ struct arena_info { /* Arena flags */ u32 flags; struct mutex err_lock; + int log_index[2]; }; /** diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c index 65cc171c721d..2adada1a5855 100644 --- a/drivers/nvdimm/pfn_devs.c +++ b/drivers/nvdimm/pfn_devs.c @@ -364,9 +364,9 @@ struct device *nd_pfn_create(struct nd_region *nd_region) int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig) { u64 checksum, offset; - unsigned long align; enum nd_pfn_mode mode; struct nd_namespace_io *nsio; + unsigned long align, start_pad; struct nd_pfn_sb *pfn_sb = nd_pfn->pfn_sb; struct nd_namespace_common *ndns = nd_pfn->ndns; const u8 *parent_uuid = nd_dev_to_uuid(&ndns->dev); @@ -410,6 +410,7 @@ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig) align = le32_to_cpu(pfn_sb->align); offset = le64_to_cpu(pfn_sb->dataoff); + start_pad = le32_to_cpu(pfn_sb->start_pad); if (align == 0) align = 1UL << ilog2(offset); mode = le32_to_cpu(pfn_sb->mode); @@ -468,7 +469,7 @@ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig) return -EBUSY; } - if ((align && !IS_ALIGNED(offset, align)) + if ((align && !IS_ALIGNED(nsio->res.start + offset + start_pad, align)) || !IS_ALIGNED(offset, PAGE_SIZE)) { dev_err(&nd_pfn->dev, "bad offset: %#llx dax disabled align: %#lx\n", @@ -582,6 +583,12 @@ static struct vmem_altmap *__nvdimm_setup_pfn(struct nd_pfn *nd_pfn, return altmap; } +static u64 phys_pmem_align_down(struct nd_pfn *nd_pfn, u64 phys) +{ + return min_t(u64, PHYS_SECTION_ALIGN_DOWN(phys), + ALIGN_DOWN(phys, nd_pfn->align)); +} + static int nd_pfn_init(struct nd_pfn *nd_pfn) { u32 dax_label_reserve = is_nd_dax(&nd_pfn->dev) ? SZ_128K : 0; @@ -637,13 +644,16 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn) start = nsio->res.start; size = PHYS_SECTION_ALIGN_UP(start + size) - start; if (region_intersects(start, size, IORESOURCE_SYSTEM_RAM, - IORES_DESC_NONE) == REGION_MIXED) { + IORES_DESC_NONE) == REGION_MIXED + || !IS_ALIGNED(start + resource_size(&nsio->res), + nd_pfn->align)) { size = resource_size(&nsio->res); - end_trunc = start + size - PHYS_SECTION_ALIGN_DOWN(start + size); + end_trunc = start + size - phys_pmem_align_down(nd_pfn, + start + size); } if (start_pad + end_trunc) - dev_info(&nd_pfn->dev, "%s section collision, truncate %d bytes\n", + dev_info(&nd_pfn->dev, "%s alignment collision, truncate %d bytes\n", dev_name(&ndns->dev), start_pad + end_trunc); /* diff --git a/drivers/parisc/lba_pci.c b/drivers/parisc/lba_pci.c index a25fed52f7e9..41b740aed3a3 100644 --- a/drivers/parisc/lba_pci.c +++ b/drivers/parisc/lba_pci.c @@ -1692,3 +1692,36 @@ void lba_set_iregs(struct parisc_device *lba, u32 ibase, u32 imask) iounmap(base_addr); } + +/* + * The design of the Diva management card in rp34x0 machines (rp3410, rp3440) + * seems rushed, so that many built-in components simply don't work. + * The following quirks disable the serial AUX port and the built-in ATI RV100 + * Radeon 7000 graphics card which both don't have any external connectors and + * thus are useless, and even worse, e.g. the AUX port occupies ttyS0 and as + * such makes those machines the only PARISC machines on which we can't use + * ttyS0 as boot console. + */ +static void quirk_diva_ati_card(struct pci_dev *dev) +{ + if (dev->subsystem_vendor != PCI_VENDOR_ID_HP || + dev->subsystem_device != 0x1292) + return; + + dev_info(&dev->dev, "Hiding Diva built-in ATI card"); + dev->device = 0; +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_RADEON_QY, + quirk_diva_ati_card); + +static void quirk_diva_aux_disable(struct pci_dev *dev) +{ + if (dev->subsystem_vendor != PCI_VENDOR_ID_HP || + dev->subsystem_device != 0x1291) + return; + + dev_info(&dev->dev, "Hiding Diva built-in AUX serial device"); + dev->device = 0; +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_DIVA_AUX, + quirk_diva_aux_disable); diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c index 11bd267fc137..bb0927de79dd 100644 --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -968,7 +968,12 @@ static int pci_pm_thaw_noirq(struct device *dev) if (pci_has_legacy_pm_support(pci_dev)) return pci_legacy_resume_early(dev); - pci_update_current_state(pci_dev, PCI_D0); + /* + * pci_restore_state() requires the device to be in D0 (because of MSI + * restoration among other things), so force it into D0 in case the + * driver's "freeze" callbacks put it into a low-power state directly. + */ + pci_set_power_state(pci_dev, PCI_D0); pci_restore_state(pci_dev); if (drv && drv->pm && drv->pm->thaw_noirq) diff --git a/drivers/pinctrl/intel/pinctrl-cherryview.c b/drivers/pinctrl/intel/pinctrl-cherryview.c index fadbca907c7c..0907531a02ca 100644 --- a/drivers/pinctrl/intel/pinctrl-cherryview.c +++ b/drivers/pinctrl/intel/pinctrl-cherryview.c @@ -1620,6 +1620,22 @@ static int chv_gpio_probe(struct chv_pinctrl *pctrl, int irq) clear_bit(i, chip->irq_valid_mask); } + /* + * The same set of machines in chv_no_valid_mask[] have incorrectly + * configured GPIOs that generate spurious interrupts so we use + * this same list to apply another quirk for them. + * + * See also https://bugzilla.kernel.org/show_bug.cgi?id=197953. + */ + if (!need_valid_mask) { + /* + * Mask all interrupts the community is able to generate + * but leave the ones that can only generate GPEs unmasked. + */ + chv_writel(GENMASK(31, pctrl->community->nirqs), + pctrl->regs + CHV_INTMASK); + } + /* Clear all interrupts */ chv_writel(0xffff, pctrl->regs + CHV_INTSTAT); diff --git a/drivers/spi/spi-armada-3700.c b/drivers/spi/spi-armada-3700.c index 568e1c65aa82..fe3fa1e8517a 100644 --- a/drivers/spi/spi-armada-3700.c +++ b/drivers/spi/spi-armada-3700.c @@ -79,6 +79,7 @@ #define A3700_SPI_BYTE_LEN BIT(5) #define A3700_SPI_CLK_PRESCALE BIT(0) #define A3700_SPI_CLK_PRESCALE_MASK (0x1f) +#define A3700_SPI_CLK_EVEN_OFFS (0x10) #define A3700_SPI_WFIFO_THRS_BIT 28 #define A3700_SPI_RFIFO_THRS_BIT 24 @@ -220,6 +221,13 @@ static void a3700_spi_clock_set(struct a3700_spi *a3700_spi, prescale = DIV_ROUND_UP(clk_get_rate(a3700_spi->clk), speed_hz); + /* For prescaler values over 15, we can only set it by steps of 2. + * Starting from A3700_SPI_CLK_EVEN_OFFS, we set values from 0 up to + * 30. We only use this range from 16 to 30. + */ + if (prescale > 15) + prescale = A3700_SPI_CLK_EVEN_OFFS + DIV_ROUND_UP(prescale, 2); + val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG); val = val & ~A3700_SPI_CLK_PRESCALE_MASK; diff --git a/drivers/spi/spi-xilinx.c b/drivers/spi/spi-xilinx.c index bc7100b93dfc..e0b9fe1d0e37 100644 --- a/drivers/spi/spi-xilinx.c +++ b/drivers/spi/spi-xilinx.c @@ -271,6 +271,7 @@ static int xilinx_spi_txrx_bufs(struct spi_device *spi, struct spi_transfer *t) while (remaining_words) { int n_words, tx_words, rx_words; u32 sr; + int stalled; n_words = min(remaining_words, xspi->buffer_size); @@ -299,7 +300,17 @@ static int xilinx_spi_txrx_bufs(struct spi_device *spi, struct spi_transfer *t) /* Read out all the data from the Rx FIFO */ rx_words = n_words; + stalled = 10; while (rx_words) { + if (rx_words == n_words && !(stalled--) && + !(sr & XSPI_SR_TX_EMPTY_MASK) && + (sr & XSPI_SR_RX_EMPTY_MASK)) { + dev_err(&spi->dev, + "Detected stall. Check C_SPI_MODE and C_SPI_MEMORY\n"); + xspi_init_hw(xspi); + return -EIO; + } + if ((sr & XSPI_SR_TX_EMPTY_MASK) && (rx_words > 1)) { xilinx_spi_rx(xspi); rx_words--; diff --git a/include/asm-generic/mm_hooks.h b/include/asm-generic/mm_hooks.h index ea189d88a3cc..8ac4e68a12f0 100644 --- a/include/asm-generic/mm_hooks.h +++ b/include/asm-generic/mm_hooks.h @@ -7,9 +7,10 @@ #ifndef _ASM_GENERIC_MM_HOOKS_H #define _ASM_GENERIC_MM_HOOKS_H -static inline void arch_dup_mmap(struct mm_struct *oldmm, - struct mm_struct *mm) +static inline int arch_dup_mmap(struct mm_struct *oldmm, + struct mm_struct *mm) { + return 0; } static inline void arch_exit_mmap(struct mm_struct *mm) diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h index 1ac457511f4e..045a7f52ab3a 100644 --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -1025,6 +1025,11 @@ static inline int pmd_clear_huge(pmd_t *pmd) struct file; int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn, unsigned long size, pgprot_t *vma_prot); + +#ifndef CONFIG_X86_ESPFIX64 +static inline void init_espfix_bsp(void) { } +#endif + #endif /* !__ASSEMBLY__ */ #ifndef io_remap_pfn_range diff --git a/include/crypto/mcryptd.h b/include/crypto/mcryptd.h index cceafa01f907..b67404fc4b34 100644 --- a/include/crypto/mcryptd.h +++ b/include/crypto/mcryptd.h @@ -27,6 +27,7 @@ static inline struct mcryptd_ahash *__mcryptd_ahash_cast( struct mcryptd_cpu_queue { struct crypto_queue queue; + spinlock_t q_lock; struct work_struct work; }; diff --git a/include/linux/bio.h b/include/linux/bio.h index 275c91c99516..45f00dd6323c 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -504,6 +504,8 @@ extern unsigned int bvec_nr_vecs(unsigned short idx); #define bio_set_dev(bio, bdev) \ do { \ + if ((bio)->bi_disk != (bdev)->bd_disk) \ + bio_clear_flag(bio, BIO_THROTTLED);\ (bio)->bi_disk = (bdev)->bd_disk; \ (bio)->bi_partno = (bdev)->bd_partno; \ } while (0) diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 96ac3815542c..1c8a8a2aedf7 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -50,8 +50,6 @@ struct blk_issue_stat { struct bio { struct bio *bi_next; /* request queue link */ struct gendisk *bi_disk; - u8 bi_partno; - blk_status_t bi_status; unsigned int bi_opf; /* bottom bits req flags, * top bits REQ_OP. Use * accessors. @@ -59,8 +57,8 @@ struct bio { unsigned short bi_flags; /* status, etc and bvec pool number */ unsigned short bi_ioprio; unsigned short bi_write_hint; - - struct bvec_iter bi_iter; + blk_status_t bi_status; + u8 bi_partno; /* Number of segments in this BIO after * physical address coalescing is performed. @@ -74,8 +72,9 @@ struct bio { unsigned int bi_seg_front_size; unsigned int bi_seg_back_size; - atomic_t __bi_remaining; + struct bvec_iter bi_iter; + atomic_t __bi_remaining; bio_end_io_t *bi_end_io; void *bi_private; diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 8da66379f7ea..fd47bd96b5d3 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -135,7 +135,7 @@ typedef __u32 __bitwise req_flags_t; struct request { struct list_head queuelist; union { - call_single_data_t csd; + struct __call_single_data csd; u64 fifo_time; }; diff --git a/init/main.c b/init/main.c index 0ee9c6866ada..8a390f60ec81 100644 --- a/init/main.c +++ b/init/main.c @@ -504,6 +504,8 @@ static void __init mm_init(void) pgtable_init(); vmalloc_init(); ioremap_huge_init(); + /* Should be run before the first non-init thread is created */ + init_espfix_bsp(); } asmlinkage __visible void __init start_kernel(void) @@ -673,10 +675,6 @@ asmlinkage __visible void __init start_kernel(void) #ifdef CONFIG_X86 if (efi_enabled(EFI_RUNTIME_SERVICES)) efi_enter_virtual_mode(); -#endif -#ifdef CONFIG_X86_ESPFIX64 - /* Should be run before the first non-init thread is created */ - init_espfix_bsp(); #endif thread_stack_cache_init(); cred_init(); diff --git a/kernel/fork.c b/kernel/fork.c index 07cc743698d3..500ce64517d9 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -721,8 +721,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm, goto out; } /* a new mm has just been created */ - arch_dup_mmap(oldmm, mm); - retval = 0; + retval = arch_dup_mmap(oldmm, mm); out: up_write(&mm->mmap_sem); flush_tlb_mm(oldmm); diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 76b47682f77f..598efa8cfe25 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1055,6 +1055,7 @@ static struct rt6_info *rt6_get_pcpu_route(struct rt6_info *rt) static struct rt6_info *rt6_make_pcpu_route(struct rt6_info *rt) { + struct fib6_table *table = rt->rt6i_table; struct rt6_info *pcpu_rt, *prev, **p; pcpu_rt = ip6_rt_pcpu_alloc(rt); @@ -1065,20 +1066,28 @@ static struct rt6_info *rt6_make_pcpu_route(struct rt6_info *rt) return net->ipv6.ip6_null_entry; } - dst_hold(&pcpu_rt->dst); - p = this_cpu_ptr(rt->rt6i_pcpu); - prev = cmpxchg(p, NULL, pcpu_rt); - if (prev) { - /* If someone did it before us, return prev instead */ - /* release refcnt taken by ip6_rt_pcpu_alloc() */ - dst_release_immediate(&pcpu_rt->dst); - /* release refcnt taken by above dst_hold() */ + read_lock_bh(&table->tb6_lock); + if (rt->rt6i_pcpu) { + p = this_cpu_ptr(rt->rt6i_pcpu); + prev = cmpxchg(p, NULL, pcpu_rt); + if (prev) { + /* If someone did it before us, return prev instead */ + dst_release_immediate(&pcpu_rt->dst); + pcpu_rt = prev; + } + } else { + /* rt has been removed from the fib6 tree + * before we have a chance to acquire the read_lock. + * In this case, don't brother to create a pcpu rt + * since rt is going away anyway. The next + * dst_check() will trigger a re-lookup. + */ dst_release_immediate(&pcpu_rt->dst); - dst_hold(&prev->dst); - pcpu_rt = prev; + pcpu_rt = rt; } - + dst_hold(&pcpu_rt->dst); rt6_dst_from_metrics_check(pcpu_rt); + read_unlock_bh(&table->tb6_lock); return pcpu_rt; } @@ -1168,28 +1177,19 @@ struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table, if (pcpu_rt) { read_unlock_bh(&table->tb6_lock); } else { - /* atomic_inc_not_zero() is needed when using rcu */ - if (atomic_inc_not_zero(&rt->rt6i_ref)) { - /* We have to do the read_unlock first - * because rt6_make_pcpu_route() may trigger - * ip6_dst_gc() which will take the write_lock. - * - * No dst_hold() on rt is needed because grabbing - * rt->rt6i_ref makes sure rt can't be released. - */ - read_unlock_bh(&table->tb6_lock); - pcpu_rt = rt6_make_pcpu_route(rt); - rt6_release(rt); - } else { - /* rt is already removed from tree */ - read_unlock_bh(&table->tb6_lock); - pcpu_rt = net->ipv6.ip6_null_entry; - dst_hold(&pcpu_rt->dst); - } + /* We have to do the read_unlock first + * because rt6_make_pcpu_route() may trigger + * ip6_dst_gc() which will take the write_lock. + */ + dst_hold(&rt->dst); + read_unlock_bh(&table->tb6_lock); + pcpu_rt = rt6_make_pcpu_route(rt); + dst_release(&rt->dst); } trace_fib6_table_lookup(net, pcpu_rt, table->tb6_id, fl6); return pcpu_rt; + } } EXPORT_SYMBOL_GPL(ip6_pol_route); diff --git a/sound/core/rawmidi.c b/sound/core/rawmidi.c index b3b353d72527..f055ca10bbc1 100644 --- a/sound/core/rawmidi.c +++ b/sound/core/rawmidi.c @@ -579,15 +579,14 @@ static int snd_rawmidi_info_user(struct snd_rawmidi_substream *substream, return 0; } -int snd_rawmidi_info_select(struct snd_card *card, struct snd_rawmidi_info *info) +static int __snd_rawmidi_info_select(struct snd_card *card, + struct snd_rawmidi_info *info) { struct snd_rawmidi *rmidi; struct snd_rawmidi_str *pstr; struct snd_rawmidi_substream *substream; - mutex_lock(&register_mutex); rmidi = snd_rawmidi_search(card, info->device); - mutex_unlock(&register_mutex); if (!rmidi) return -ENXIO; if (info->stream < 0 || info->stream > 1) @@ -603,6 +602,16 @@ int snd_rawmidi_info_select(struct snd_card *card, struct snd_rawmidi_info *info } return -ENXIO; } + +int snd_rawmidi_info_select(struct snd_card *card, struct snd_rawmidi_info *info) +{ + int ret; + + mutex_lock(&register_mutex); + ret = __snd_rawmidi_info_select(card, info); + mutex_unlock(&register_mutex); + return ret; +} EXPORT_SYMBOL(snd_rawmidi_info_select); static int snd_rawmidi_info_select_user(struct snd_card *card, diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c index c19c81d230bd..b4f1b6e88305 100644 --- a/sound/pci/hda/patch_hdmi.c +++ b/sound/pci/hda/patch_hdmi.c @@ -55,10 +55,11 @@ MODULE_PARM_DESC(static_hdmi_pcm, "Don't restrict PCM parameters per ELD info"); #define is_kabylake(codec) ((codec)->core.vendor_id == 0x8086280b) #define is_geminilake(codec) (((codec)->core.vendor_id == 0x8086280d) || \ ((codec)->core.vendor_id == 0x80862800)) +#define is_cannonlake(codec) ((codec)->core.vendor_id == 0x8086280c) #define is_haswell_plus(codec) (is_haswell(codec) || is_broadwell(codec) \ || is_skylake(codec) || is_broxton(codec) \ - || is_kabylake(codec)) || is_geminilake(codec) - + || is_kabylake(codec)) || is_geminilake(codec) \ + || is_cannonlake(codec) #define is_valleyview(codec) ((codec)->core.vendor_id == 0x80862882) #define is_cherryview(codec) ((codec)->core.vendor_id == 0x80862883) #define is_valleyview_plus(codec) (is_valleyview(codec) || is_cherryview(codec)) @@ -3841,6 +3842,7 @@ HDA_CODEC_ENTRY(0x80862808, "Broadwell HDMI", patch_i915_hsw_hdmi), HDA_CODEC_ENTRY(0x80862809, "Skylake HDMI", patch_i915_hsw_hdmi), HDA_CODEC_ENTRY(0x8086280a, "Broxton HDMI", patch_i915_hsw_hdmi), HDA_CODEC_ENTRY(0x8086280b, "Kabylake HDMI", patch_i915_hsw_hdmi), +HDA_CODEC_ENTRY(0x8086280c, "Cannonlake HDMI", patch_i915_glk_hdmi), HDA_CODEC_ENTRY(0x8086280d, "Geminilake HDMI", patch_i915_glk_hdmi), HDA_CODEC_ENTRY(0x80862800, "Geminilake HDMI", patch_i915_glk_hdmi), HDA_CODEC_ENTRY(0x80862880, "CedarTrail HDMI", patch_generic_hdmi), diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index b076386c8952..9ac4b9076ee2 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -5162,6 +5162,22 @@ static void alc233_alc662_fixup_lenovo_dual_codecs(struct hda_codec *codec, } } +/* Forcibly assign NID 0x03 to HP/LO while NID 0x02 to SPK for EQ */ +static void alc274_fixup_bind_dacs(struct hda_codec *codec, + const struct hda_fixup *fix, int action) +{ + struct alc_spec *spec = codec->spec; + static hda_nid_t preferred_pairs[] = { + 0x21, 0x03, 0x1b, 0x03, 0x16, 0x02, + 0 + }; + + if (action != HDA_FIXUP_ACT_PRE_PROBE) + return; + + spec->gen.preferred_dacs = preferred_pairs; +} + /* for hda_fixup_thinkpad_acpi() */ #include "thinkpad_helper.c" @@ -5279,6 +5295,8 @@ enum { ALC233_FIXUP_LENOVO_MULTI_CODECS, ALC294_FIXUP_LENOVO_MIC_LOCATION, ALC700_FIXUP_INTEL_REFERENCE, + ALC274_FIXUP_DELL_BIND_DACS, + ALC274_FIXUP_DELL_AIO_LINEOUT_VERB, }; static const struct hda_fixup alc269_fixups[] = { @@ -6089,6 +6107,21 @@ static const struct hda_fixup alc269_fixups[] = { {} } }, + [ALC274_FIXUP_DELL_BIND_DACS] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc274_fixup_bind_dacs, + .chained = true, + .chain_id = ALC269_FIXUP_DELL1_MIC_NO_PRESENCE + }, + [ALC274_FIXUP_DELL_AIO_LINEOUT_VERB] = { + .type = HDA_FIXUP_PINS, + .v.pins = (const struct hda_pintbl[]) { + { 0x1b, 0x0401102f }, + { } + }, + .chained = true, + .chain_id = ALC274_FIXUP_DELL_BIND_DACS + }, }; static const struct snd_pci_quirk alc269_fixup_tbl[] = { @@ -6550,7 +6583,7 @@ static const struct snd_hda_pin_quirk alc269_pin_fixup_tbl[] = { {0x14, 0x90170110}, {0x1b, 0x90a70130}, {0x21, 0x03211020}), - SND_HDA_PIN_QUIRK(0x10ec0274, 0x1028, "Dell", ALC269_FIXUP_DELL1_MIC_NO_PRESENCE, + SND_HDA_PIN_QUIRK(0x10ec0274, 0x1028, "Dell", ALC274_FIXUP_DELL_AIO_LINEOUT_VERB, {0x12, 0xb7a60130}, {0x13, 0xb8a61140}, {0x16, 0x90170110}, diff --git a/sound/usb/mixer.c b/sound/usb/mixer.c index 4fde4f8d4444..75bce127d768 100644 --- a/sound/usb/mixer.c +++ b/sound/usb/mixer.c @@ -2173,20 +2173,25 @@ static int parse_audio_selector_unit(struct mixer_build *state, int unitid, kctl->private_value = (unsigned long)namelist; kctl->private_free = usb_mixer_selector_elem_free; - nameid = uac_selector_unit_iSelector(desc); + /* check the static mapping table at first */ len = check_mapped_name(map, kctl->id.name, sizeof(kctl->id.name)); - if (len) - ; - else if (nameid) - len = snd_usb_copy_string_desc(state, nameid, kctl->id.name, - sizeof(kctl->id.name)); - else - len = get_term_name(state, &state->oterm, - kctl->id.name, sizeof(kctl->id.name), 0); - if (!len) { - strlcpy(kctl->id.name, "USB", sizeof(kctl->id.name)); + /* no mapping ? */ + /* if iSelector is given, use it */ + nameid = uac_selector_unit_iSelector(desc); + if (nameid) + len = snd_usb_copy_string_desc(state, nameid, + kctl->id.name, + sizeof(kctl->id.name)); + /* ... or pick up the terminal name at next */ + if (!len) + len = get_term_name(state, &state->oterm, + kctl->id.name, sizeof(kctl->id.name), 0); + /* ... or use the fixed string "USB" as the last resort */ + if (!len) + strlcpy(kctl->id.name, "USB", sizeof(kctl->id.name)); + /* and add the proper suffix */ if (desc->bDescriptorSubtype == UAC2_CLOCK_SELECTOR) append_ctl_name(kctl, " Clock Source"); else if ((state->oterm.type & 0xff00) == 0x0100) diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index 20624320b753..8d7db7cd4f88 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -1172,10 +1172,11 @@ static bool is_marantz_denon_dac(unsigned int id) /* TEAC UD-501/UD-503/NT-503 USB DACs need a vendor cmd to switch * between PCM/DOP and native DSD mode */ -static bool is_teac_50X_dac(unsigned int id) +static bool is_teac_dsd_dac(unsigned int id) { switch (id) { case USB_ID(0x0644, 0x8043): /* TEAC UD-501/UD-503/NT-503 */ + case USB_ID(0x0644, 0x8044): /* Esoteric D-05X */ return true; } return false; @@ -1208,7 +1209,7 @@ int snd_usb_select_mode_quirk(struct snd_usb_substream *subs, break; } mdelay(20); - } else if (is_teac_50X_dac(subs->stream->chip->usb_id)) { + } else if (is_teac_dsd_dac(subs->stream->chip->usb_id)) { /* Vendor mode switch cmd is required. */ switch (fmt->altsetting) { case 3: /* DSD mode (DSD_U32) requested */ @@ -1398,7 +1399,7 @@ u64 snd_usb_interface_dsd_format_quirks(struct snd_usb_audio *chip, } /* TEAC devices with USB DAC functionality */ - if (is_teac_50X_dac(chip->usb_id)) { + if (is_teac_dsd_dac(chip->usb_id)) { if (fp->altsetting == 3) return SNDRV_PCM_FMTBIT_DSD_U32_BE; } diff --git a/tools/objtool/.gitignore b/tools/objtool/.gitignore index d3102c865a95..914cff12899b 100644 --- a/tools/objtool/.gitignore +++ b/tools/objtool/.gitignore @@ -1,3 +1,3 @@ -arch/x86/insn/inat-tables.c +arch/x86/lib/inat-tables.c objtool fixdep diff --git a/tools/objtool/Makefile b/tools/objtool/Makefile index 424b1965d06f..ae0272f9a091 100644 --- a/tools/objtool/Makefile +++ b/tools/objtool/Makefile @@ -7,9 +7,11 @@ ARCH := x86 endif # always use the host compiler -CC = gcc -LD = ld -AR = ar +HOSTCC ?= gcc +HOSTLD ?= ld +CC = $(HOSTCC) +LD = $(HOSTLD) +AR = ar ifeq ($(srctree),) srctree := $(patsubst %/,%,$(dir $(CURDIR))) @@ -25,7 +27,9 @@ OBJTOOL_IN := $(OBJTOOL)-in.o all: $(OBJTOOL) -INCLUDES := -I$(srctree)/tools/include -I$(srctree)/tools/arch/$(HOSTARCH)/include/uapi +INCLUDES := -I$(srctree)/tools/include \ + -I$(srctree)/tools/arch/$(HOSTARCH)/include/uapi \ + -I$(srctree)/tools/objtool/arch/$(ARCH)/include WARNINGS := $(EXTRA_WARNINGS) -Wno-switch-default -Wno-switch-enum -Wno-packed CFLAGS += -Wall -Werror $(WARNINGS) -fomit-frame-pointer -O2 -g $(INCLUDES) LDFLAGS += -lelf $(LIBSUBCMD) @@ -41,22 +45,8 @@ include $(srctree)/tools/build/Makefile.include $(OBJTOOL_IN): fixdep FORCE @$(MAKE) $(build)=objtool -# Busybox's diff doesn't have -I, avoid warning in that case -# $(OBJTOOL): $(LIBSUBCMD) $(OBJTOOL_IN) - @(diff -I 2>&1 | grep -q 'option requires an argument' && \ - test -d ../../kernel -a -d ../../tools -a -d ../objtool && (( \ - diff -I'^#include' arch/x86/insn/insn.c ../../arch/x86/lib/insn.c >/dev/null && \ - diff -I'^#include' arch/x86/insn/inat.c ../../arch/x86/lib/inat.c >/dev/null && \ - diff arch/x86/insn/x86-opcode-map.txt ../../arch/x86/lib/x86-opcode-map.txt >/dev/null && \ - diff arch/x86/insn/gen-insn-attr-x86.awk ../../arch/x86/tools/gen-insn-attr-x86.awk >/dev/null && \ - diff -I'^#include' arch/x86/insn/insn.h ../../arch/x86/include/asm/insn.h >/dev/null && \ - diff -I'^#include' arch/x86/insn/inat.h ../../arch/x86/include/asm/inat.h >/dev/null && \ - diff -I'^#include' arch/x86/insn/inat_types.h ../../arch/x86/include/asm/inat_types.h >/dev/null) \ - || echo "warning: objtool: x86 instruction decoder differs from kernel" >&2 )) || true - @(test -d ../../kernel -a -d ../../tools -a -d ../objtool && (( \ - diff ../../arch/x86/include/asm/orc_types.h orc_types.h >/dev/null) \ - || echo "warning: objtool: orc_types.h differs from kernel" >&2 )) || true + @./sync-check.sh $(QUIET_LINK)$(CC) $(OBJTOOL_IN) $(LDFLAGS) -o $@ @@ -66,7 +56,7 @@ $(LIBSUBCMD): fixdep FORCE clean: $(call QUIET_CLEAN, objtool) $(RM) $(OBJTOOL) $(Q)find $(OUTPUT) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete - $(Q)$(RM) $(OUTPUT)arch/x86/insn/inat-tables.c $(OUTPUT)fixdep + $(Q)$(RM) $(OUTPUT)arch/x86/lib/inat-tables.c $(OUTPUT)fixdep FORCE: diff --git a/tools/objtool/arch/x86/Build b/tools/objtool/arch/x86/Build index debbdb0b5c43..b998412c017d 100644 --- a/tools/objtool/arch/x86/Build +++ b/tools/objtool/arch/x86/Build @@ -1,12 +1,12 @@ objtool-y += decode.o -inat_tables_script = arch/x86/insn/gen-insn-attr-x86.awk -inat_tables_maps = arch/x86/insn/x86-opcode-map.txt +inat_tables_script = arch/x86/tools/gen-insn-attr-x86.awk +inat_tables_maps = arch/x86/lib/x86-opcode-map.txt -$(OUTPUT)arch/x86/insn/inat-tables.c: $(inat_tables_script) $(inat_tables_maps) +$(OUTPUT)arch/x86/lib/inat-tables.c: $(inat_tables_script) $(inat_tables_maps) $(call rule_mkdir) $(Q)$(call echo-cmd,gen)$(AWK) -f $(inat_tables_script) $(inat_tables_maps) > $@ -$(OUTPUT)arch/x86/decode.o: $(OUTPUT)arch/x86/insn/inat-tables.c +$(OUTPUT)arch/x86/decode.o: $(OUTPUT)arch/x86/lib/inat-tables.c -CFLAGS_decode.o += -I$(OUTPUT)arch/x86/insn +CFLAGS_decode.o += -I$(OUTPUT)arch/x86/lib diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c index 34a579f806e3..8acfc47af70e 100644 --- a/tools/objtool/arch/x86/decode.c +++ b/tools/objtool/arch/x86/decode.c @@ -19,9 +19,9 @@ #include <stdlib.h> #define unlikely(cond) (cond) -#include "insn/insn.h" -#include "insn/inat.c" -#include "insn/insn.c" +#include <asm/insn.h> +#include "lib/inat.c" +#include "lib/insn.c" #include "../../elf.h" #include "../../arch.h" diff --git a/tools/objtool/arch/x86/include/asm/inat.h b/tools/objtool/arch/x86/include/asm/inat.h new file mode 100644 index 000000000000..1c78580e58be --- /dev/null +++ b/tools/objtool/arch/x86/include/asm/inat.h @@ -0,0 +1,244 @@ +#ifndef _ASM_X86_INAT_H +#define _ASM_X86_INAT_H +/* + * x86 instruction attributes + * + * Written by Masami Hiramatsu <mhiramat(a)redhat.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + */ +#include <asm/inat_types.h> + +/* + * Internal bits. Don't use bitmasks directly, because these bits are + * unstable. You should use checking functions. + */ + +#define INAT_OPCODE_TABLE_SIZE 256 +#define INAT_GROUP_TABLE_SIZE 8 + +/* Legacy last prefixes */ +#define INAT_PFX_OPNDSZ 1 /* 0x66 */ /* LPFX1 */ +#define INAT_PFX_REPE 2 /* 0xF3 */ /* LPFX2 */ +#define INAT_PFX_REPNE 3 /* 0xF2 */ /* LPFX3 */ +/* Other Legacy prefixes */ +#define INAT_PFX_LOCK 4 /* 0xF0 */ +#define INAT_PFX_CS 5 /* 0x2E */ +#define INAT_PFX_DS 6 /* 0x3E */ +#define INAT_PFX_ES 7 /* 0x26 */ +#define INAT_PFX_FS 8 /* 0x64 */ +#define INAT_PFX_GS 9 /* 0x65 */ +#define INAT_PFX_SS 10 /* 0x36 */ +#define INAT_PFX_ADDRSZ 11 /* 0x67 */ +/* x86-64 REX prefix */ +#define INAT_PFX_REX 12 /* 0x4X */ +/* AVX VEX prefixes */ +#define INAT_PFX_VEX2 13 /* 2-bytes VEX prefix */ +#define INAT_PFX_VEX3 14 /* 3-bytes VEX prefix */ +#define INAT_PFX_EVEX 15 /* EVEX prefix */ + +#define INAT_LSTPFX_MAX 3 +#define INAT_LGCPFX_MAX 11 + +/* Immediate size */ +#define INAT_IMM_BYTE 1 +#define INAT_IMM_WORD 2 +#define INAT_IMM_DWORD 3 +#define INAT_IMM_QWORD 4 +#define INAT_IMM_PTR 5 +#define INAT_IMM_VWORD32 6 +#define INAT_IMM_VWORD 7 + +/* Legacy prefix */ +#define INAT_PFX_OFFS 0 +#define INAT_PFX_BITS 4 +#define INAT_PFX_MAX ((1 << INAT_PFX_BITS) - 1) +#define INAT_PFX_MASK (INAT_PFX_MAX << INAT_PFX_OFFS) +/* Escape opcodes */ +#define INAT_ESC_OFFS (INAT_PFX_OFFS + INAT_PFX_BITS) +#define INAT_ESC_BITS 2 +#define INAT_ESC_MAX ((1 << INAT_ESC_BITS) - 1) +#define INAT_ESC_MASK (INAT_ESC_MAX << INAT_ESC_OFFS) +/* Group opcodes (1-16) */ +#define INAT_GRP_OFFS (INAT_ESC_OFFS + INAT_ESC_BITS) +#define INAT_GRP_BITS 5 +#define INAT_GRP_MAX ((1 << INAT_GRP_BITS) - 1) +#define INAT_GRP_MASK (INAT_GRP_MAX << INAT_GRP_OFFS) +/* Immediates */ +#define INAT_IMM_OFFS (INAT_GRP_OFFS + INAT_GRP_BITS) +#define INAT_IMM_BITS 3 +#define INAT_IMM_MASK (((1 << INAT_IMM_BITS) - 1) << INAT_IMM_OFFS) +/* Flags */ +#define INAT_FLAG_OFFS (INAT_IMM_OFFS + INAT_IMM_BITS) +#define INAT_MODRM (1 << (INAT_FLAG_OFFS)) +#define INAT_FORCE64 (1 << (INAT_FLAG_OFFS + 1)) +#define INAT_SCNDIMM (1 << (INAT_FLAG_OFFS + 2)) +#define INAT_MOFFSET (1 << (INAT_FLAG_OFFS + 3)) +#define INAT_VARIANT (1 << (INAT_FLAG_OFFS + 4)) +#define INAT_VEXOK (1 << (INAT_FLAG_OFFS + 5)) +#define INAT_VEXONLY (1 << (INAT_FLAG_OFFS + 6)) +#define INAT_EVEXONLY (1 << (INAT_FLAG_OFFS + 7)) +/* Attribute making macros for attribute tables */ +#define INAT_MAKE_PREFIX(pfx) (pfx << INAT_PFX_OFFS) +#define INAT_MAKE_ESCAPE(esc) (esc << INAT_ESC_OFFS) +#define INAT_MAKE_GROUP(grp) ((grp << INAT_GRP_OFFS) | INAT_MODRM) +#define INAT_MAKE_IMM(imm) (imm << INAT_IMM_OFFS) + +/* Identifiers for segment registers */ +#define INAT_SEG_REG_IGNORE 0 +#define INAT_SEG_REG_DEFAULT 1 +#define INAT_SEG_REG_CS 2 +#define INAT_SEG_REG_SS 3 +#define INAT_SEG_REG_DS 4 +#define INAT_SEG_REG_ES 5 +#define INAT_SEG_REG_FS 6 +#define INAT_SEG_REG_GS 7 + +/* Attribute search APIs */ +extern insn_attr_t inat_get_opcode_attribute(insn_byte_t opcode); +extern int inat_get_last_prefix_id(insn_byte_t last_pfx); +extern insn_attr_t inat_get_escape_attribute(insn_byte_t opcode, + int lpfx_id, + insn_attr_t esc_attr); +extern insn_attr_t inat_get_group_attribute(insn_byte_t modrm, + int lpfx_id, + insn_attr_t esc_attr); +extern insn_attr_t inat_get_avx_attribute(insn_byte_t opcode, + insn_byte_t vex_m, + insn_byte_t vex_pp); + +/* Attribute checking functions */ +static inline int inat_is_legacy_prefix(insn_attr_t attr) +{ + attr &= INAT_PFX_MASK; + return attr && attr <= INAT_LGCPFX_MAX; +} + +static inline int inat_is_address_size_prefix(insn_attr_t attr) +{ + return (attr & INAT_PFX_MASK) == INAT_PFX_ADDRSZ; +} + +static inline int inat_is_operand_size_prefix(insn_attr_t attr) +{ + return (attr & INAT_PFX_MASK) == INAT_PFX_OPNDSZ; +} + +static inline int inat_is_rex_prefix(insn_attr_t attr) +{ + return (attr & INAT_PFX_MASK) == INAT_PFX_REX; +} + +static inline int inat_last_prefix_id(insn_attr_t attr) +{ + if ((attr & INAT_PFX_MASK) > INAT_LSTPFX_MAX) + return 0; + else + return attr & INAT_PFX_MASK; +} + +static inline int inat_is_vex_prefix(insn_attr_t attr) +{ + attr &= INAT_PFX_MASK; + return attr == INAT_PFX_VEX2 || attr == INAT_PFX_VEX3 || + attr == INAT_PFX_EVEX; +} + +static inline int inat_is_evex_prefix(insn_attr_t attr) +{ + return (attr & INAT_PFX_MASK) == INAT_PFX_EVEX; +} + +static inline int inat_is_vex3_prefix(insn_attr_t attr) +{ + return (attr & INAT_PFX_MASK) == INAT_PFX_VEX3; +} + +static inline int inat_is_escape(insn_attr_t attr) +{ + return attr & INAT_ESC_MASK; +} + +static inline int inat_escape_id(insn_attr_t attr) +{ + return (attr & INAT_ESC_MASK) >> INAT_ESC_OFFS; +} + +static inline int inat_is_group(insn_attr_t attr) +{ + return attr & INAT_GRP_MASK; +} + +static inline int inat_group_id(insn_attr_t attr) +{ + return (attr & INAT_GRP_MASK) >> INAT_GRP_OFFS; +} + +static inline int inat_group_common_attribute(insn_attr_t attr) +{ + return attr & ~INAT_GRP_MASK; +} + +static inline int inat_has_immediate(insn_attr_t attr) +{ + return attr & INAT_IMM_MASK; +} + +static inline int inat_immediate_size(insn_attr_t attr) +{ + return (attr & INAT_IMM_MASK) >> INAT_IMM_OFFS; +} + +static inline int inat_has_modrm(insn_attr_t attr) +{ + return attr & INAT_MODRM; +} + +static inline int inat_is_force64(insn_attr_t attr) +{ + return attr & INAT_FORCE64; +} + +static inline int inat_has_second_immediate(insn_attr_t attr) +{ + return attr & INAT_SCNDIMM; +} + +static inline int inat_has_moffset(insn_attr_t attr) +{ + return attr & INAT_MOFFSET; +} + +static inline int inat_has_variant(insn_attr_t attr) +{ + return attr & INAT_VARIANT; +} + +static inline int inat_accept_vex(insn_attr_t attr) +{ + return attr & INAT_VEXOK; +} + +static inline int inat_must_vex(insn_attr_t attr) +{ + return attr & (INAT_VEXONLY | INAT_EVEXONLY); +} + +static inline int inat_must_evex(insn_attr_t attr) +{ + return attr & INAT_EVEXONLY; +} +#endif diff --git a/tools/objtool/arch/x86/include/asm/inat_types.h b/tools/objtool/arch/x86/include/asm/inat_types.h new file mode 100644 index 000000000000..cb3c20ce39cf --- /dev/null +++ b/tools/objtool/arch/x86/include/asm/inat_types.h @@ -0,0 +1,29 @@ +#ifndef _ASM_X86_INAT_TYPES_H +#define _ASM_X86_INAT_TYPES_H +/* + * x86 instruction attributes + * + * Written by Masami Hiramatsu <mhiramat(a)redhat.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + */ + +/* Instruction attributes */ +typedef unsigned int insn_attr_t; +typedef unsigned char insn_byte_t; +typedef signed int insn_value_t; + +#endif diff --git a/tools/objtool/arch/x86/include/asm/insn.h b/tools/objtool/arch/x86/include/asm/insn.h new file mode 100644 index 000000000000..b3e32b010ab1 --- /dev/null +++ b/tools/objtool/arch/x86/include/asm/insn.h @@ -0,0 +1,211 @@ +#ifndef _ASM_X86_INSN_H +#define _ASM_X86_INSN_H +/* + * x86 instruction analysis + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) IBM Corporation, 2009 + */ + +/* insn_attr_t is defined in inat.h */ +#include <asm/inat.h> + +struct insn_field { + union { + insn_value_t value; + insn_byte_t bytes[4]; + }; + /* !0 if we've run insn_get_xxx() for this field */ + unsigned char got; + unsigned char nbytes; +}; + +struct insn { + struct insn_field prefixes; /* + * Prefixes + * prefixes.bytes[3]: last prefix + */ + struct insn_field rex_prefix; /* REX prefix */ + struct insn_field vex_prefix; /* VEX prefix */ + struct insn_field opcode; /* + * opcode.bytes[0]: opcode1 + * opcode.bytes[1]: opcode2 + * opcode.bytes[2]: opcode3 + */ + struct insn_field modrm; + struct insn_field sib; + struct insn_field displacement; + union { + struct insn_field immediate; + struct insn_field moffset1; /* for 64bit MOV */ + struct insn_field immediate1; /* for 64bit imm or off16/32 */ + }; + union { + struct insn_field moffset2; /* for 64bit MOV */ + struct insn_field immediate2; /* for 64bit imm or seg16 */ + }; + + insn_attr_t attr; + unsigned char opnd_bytes; + unsigned char addr_bytes; + unsigned char length; + unsigned char x86_64; + + const insn_byte_t *kaddr; /* kernel address of insn to analyze */ + const insn_byte_t *end_kaddr; /* kernel address of last insn in buffer */ + const insn_byte_t *next_byte; +}; + +#define MAX_INSN_SIZE 15 + +#define X86_MODRM_MOD(modrm) (((modrm) & 0xc0) >> 6) +#define X86_MODRM_REG(modrm) (((modrm) & 0x38) >> 3) +#define X86_MODRM_RM(modrm) ((modrm) & 0x07) + +#define X86_SIB_SCALE(sib) (((sib) & 0xc0) >> 6) +#define X86_SIB_INDEX(sib) (((sib) & 0x38) >> 3) +#define X86_SIB_BASE(sib) ((sib) & 0x07) + +#define X86_REX_W(rex) ((rex) & 8) +#define X86_REX_R(rex) ((rex) & 4) +#define X86_REX_X(rex) ((rex) & 2) +#define X86_REX_B(rex) ((rex) & 1) + +/* VEX bit flags */ +#define X86_VEX_W(vex) ((vex) & 0x80) /* VEX3 Byte2 */ +#define X86_VEX_R(vex) ((vex) & 0x80) /* VEX2/3 Byte1 */ +#define X86_VEX_X(vex) ((vex) & 0x40) /* VEX3 Byte1 */ +#define X86_VEX_B(vex) ((vex) & 0x20) /* VEX3 Byte1 */ +#define X86_VEX_L(vex) ((vex) & 0x04) /* VEX3 Byte2, VEX2 Byte1 */ +/* VEX bit fields */ +#define X86_EVEX_M(vex) ((vex) & 0x03) /* EVEX Byte1 */ +#define X86_VEX3_M(vex) ((vex) & 0x1f) /* VEX3 Byte1 */ +#define X86_VEX2_M 1 /* VEX2.M always 1 */ +#define X86_VEX_V(vex) (((vex) & 0x78) >> 3) /* VEX3 Byte2, VEX2 Byte1 */ +#define X86_VEX_P(vex) ((vex) & 0x03) /* VEX3 Byte2, VEX2 Byte1 */ +#define X86_VEX_M_MAX 0x1f /* VEX3.M Maximum value */ + +extern void insn_init(struct insn *insn, const void *kaddr, int buf_len, int x86_64); +extern void insn_get_prefixes(struct insn *insn); +extern void insn_get_opcode(struct insn *insn); +extern void insn_get_modrm(struct insn *insn); +extern void insn_get_sib(struct insn *insn); +extern void insn_get_displacement(struct insn *insn); +extern void insn_get_immediate(struct insn *insn); +extern void insn_get_length(struct insn *insn); + +/* Attribute will be determined after getting ModRM (for opcode groups) */ +static inline void insn_get_attribute(struct insn *insn) +{ + insn_get_modrm(insn); +} + +/* Instruction uses RIP-relative addressing */ +extern int insn_rip_relative(struct insn *insn); + +/* Init insn for kernel text */ +static inline void kernel_insn_init(struct insn *insn, + const void *kaddr, int buf_len) +{ +#ifdef CONFIG_X86_64 + insn_init(insn, kaddr, buf_len, 1); +#else /* CONFIG_X86_32 */ + insn_init(insn, kaddr, buf_len, 0); +#endif +} + +static inline int insn_is_avx(struct insn *insn) +{ + if (!insn->prefixes.got) + insn_get_prefixes(insn); + return (insn->vex_prefix.value != 0); +} + +static inline int insn_is_evex(struct insn *insn) +{ + if (!insn->prefixes.got) + insn_get_prefixes(insn); + return (insn->vex_prefix.nbytes == 4); +} + +/* Ensure this instruction is decoded completely */ +static inline int insn_complete(struct insn *insn) +{ + return insn->opcode.got && insn->modrm.got && insn->sib.got && + insn->displacement.got && insn->immediate.got; +} + +static inline insn_byte_t insn_vex_m_bits(struct insn *insn) +{ + if (insn->vex_prefix.nbytes == 2) /* 2 bytes VEX */ + return X86_VEX2_M; + else if (insn->vex_prefix.nbytes == 3) /* 3 bytes VEX */ + return X86_VEX3_M(insn->vex_prefix.bytes[1]); + else /* EVEX */ + return X86_EVEX_M(insn->vex_prefix.bytes[1]); +} + +static inline insn_byte_t insn_vex_p_bits(struct insn *insn) +{ + if (insn->vex_prefix.nbytes == 2) /* 2 bytes VEX */ + return X86_VEX_P(insn->vex_prefix.bytes[1]); + else + return X86_VEX_P(insn->vex_prefix.bytes[2]); +} + +/* Get the last prefix id from last prefix or VEX prefix */ +static inline int insn_last_prefix_id(struct insn *insn) +{ + if (insn_is_avx(insn)) + return insn_vex_p_bits(insn); /* VEX_p is a SIMD prefix id */ + + if (insn->prefixes.bytes[3]) + return inat_get_last_prefix_id(insn->prefixes.bytes[3]); + + return 0; +} + +/* Offset of each field from kaddr */ +static inline int insn_offset_rex_prefix(struct insn *insn) +{ + return insn->prefixes.nbytes; +} +static inline int insn_offset_vex_prefix(struct insn *insn) +{ + return insn_offset_rex_prefix(insn) + insn->rex_prefix.nbytes; +} +static inline int insn_offset_opcode(struct insn *insn) +{ + return insn_offset_vex_prefix(insn) + insn->vex_prefix.nbytes; +} +static inline int insn_offset_modrm(struct insn *insn) +{ + return insn_offset_opcode(insn) + insn->opcode.nbytes; +} +static inline int insn_offset_sib(struct insn *insn) +{ + return insn_offset_modrm(insn) + insn->modrm.nbytes; +} +static inline int insn_offset_displacement(struct insn *insn) +{ + return insn_offset_sib(insn) + insn->sib.nbytes; +} +static inline int insn_offset_immediate(struct insn *insn) +{ + return insn_offset_displacement(insn) + insn->displacement.nbytes; +} + +#endif /* _ASM_X86_INSN_H */ diff --git a/tools/objtool/arch/x86/include/asm/orc_types.h b/tools/objtool/arch/x86/include/asm/orc_types.h new file mode 100644 index 000000000000..9c9dc579bd7d --- /dev/null +++ b/tools/objtool/arch/x86/include/asm/orc_types.h @@ -0,0 +1,107 @@ +/* + * Copyright (C) 2017 Josh Poimboeuf <jpoimboe(a)redhat.com> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. + */ + +#ifndef _ORC_TYPES_H +#define _ORC_TYPES_H + +#include <linux/types.h> +#include <linux/compiler.h> + +/* + * The ORC_REG_* registers are base registers which are used to find other + * registers on the stack. + * + * ORC_REG_PREV_SP, also known as DWARF Call Frame Address (CFA), is the + * address of the previous frame: the caller's SP before it called the current + * function. + * + * ORC_REG_UNDEFINED means the corresponding register's value didn't change in + * the current frame. + * + * The most commonly used base registers are SP and BP -- which the previous SP + * is usually based on -- and PREV_SP and UNDEFINED -- which the previous BP is + * usually based on. + * + * The rest of the base registers are needed for special cases like entry code + * and GCC realigned stacks. + */ +#define ORC_REG_UNDEFINED 0 +#define ORC_REG_PREV_SP 1 +#define ORC_REG_DX 2 +#define ORC_REG_DI 3 +#define ORC_REG_BP 4 +#define ORC_REG_SP 5 +#define ORC_REG_R10 6 +#define ORC_REG_R13 7 +#define ORC_REG_BP_INDIRECT 8 +#define ORC_REG_SP_INDIRECT 9 +#define ORC_REG_MAX 15 + +/* + * ORC_TYPE_CALL: Indicates that sp_reg+sp_offset resolves to PREV_SP (the + * caller's SP right before it made the call). Used for all callable + * functions, i.e. all C code and all callable asm functions. + * + * ORC_TYPE_REGS: Used in entry code to indicate that sp_reg+sp_offset points + * to a fully populated pt_regs from a syscall, interrupt, or exception. + * + * ORC_TYPE_REGS_IRET: Used in entry code to indicate that sp_reg+sp_offset + * points to the iret return frame. + * + * The UNWIND_HINT macros are used only for the unwind_hint struct. They + * aren't used in struct orc_entry due to size and complexity constraints. + * Objtool converts them to real types when it converts the hints to orc + * entries. + */ +#define ORC_TYPE_CALL 0 +#define ORC_TYPE_REGS 1 +#define ORC_TYPE_REGS_IRET 2 +#define UNWIND_HINT_TYPE_SAVE 3 +#define UNWIND_HINT_TYPE_RESTORE 4 + +#ifndef __ASSEMBLY__ +/* + * This struct is more or less a vastly simplified version of the DWARF Call + * Frame Information standard. It contains only the necessary parts of DWARF + * CFI, simplified for ease of access by the in-kernel unwinder. It tells the + * unwinder how to find the previous SP and BP (and sometimes entry regs) on + * the stack for a given code address. Each instance of the struct corresponds + * to one or more code locations. + */ +struct orc_entry { + s16 sp_offset; + s16 bp_offset; + unsigned sp_reg:4; + unsigned bp_reg:4; + unsigned type:2; +} __packed; + +/* + * This struct is used by asm and inline asm code to manually annotate the + * location of registers on the stack for the ORC unwinder. + * + * Type can be either ORC_TYPE_* or UNWIND_HINT_TYPE_*. + */ +struct unwind_hint { + u32 ip; + s16 sp_offset; + u8 sp_reg; + u8 type; +}; +#endif /* __ASSEMBLY__ */ + +#endif /* _ORC_TYPES_H */ diff --git a/tools/objtool/arch/x86/insn/gen-insn-attr-x86.awk b/tools/objtool/arch/x86/insn/gen-insn-attr-x86.awk deleted file mode 100644 index b02a36b2c14f..000000000000 --- a/tools/objtool/arch/x86/insn/gen-insn-attr-x86.awk +++ /dev/null @@ -1,393 +0,0 @@ -#!/bin/awk -f -# SPDX-License-Identifier: GPL-2.0 -# gen-insn-attr-x86.awk: Instruction attribute table generator -# Written by Masami Hiramatsu <mhiramat(a)redhat.com> -# -# Usage: awk -f gen-insn-attr-x86.awk x86-opcode-map.txt > inat-tables.c - -# Awk implementation sanity check -function check_awk_implement() { - if (sprintf("%x", 0) != "0") - return "Your awk has a printf-format problem." - return "" -} - -# Clear working vars -function clear_vars() { - delete table - delete lptable2 - delete lptable1 - delete lptable3 - eid = -1 # escape id - gid = -1 # group id - aid = -1 # AVX id - tname = "" -} - -BEGIN { - # Implementation error checking - awkchecked = check_awk_implement() - if (awkchecked != "") { - print "Error: " awkchecked > "/dev/stderr" - print "Please try to use gawk." > "/dev/stderr" - exit 1 - } - - # Setup generating tables - print "/* x86 opcode map generated from x86-opcode-map.txt */" - print "/* Do not change this code. */\n" - ggid = 1 - geid = 1 - gaid = 0 - delete etable - delete gtable - delete atable - - opnd_expr = "^[A-Za-z/]" - ext_expr = "^\$" - sep_expr = "^\\|$" - group_expr = "^Grp[0-9A-Za-z]+" - - imm_expr = "^[IJAOL][a-z]" - imm_flag["Ib"] = "INAT_MAKE_IMM(INAT_IMM_BYTE)" - imm_flag["Jb"] = "INAT_MAKE_IMM(INAT_IMM_BYTE)" - imm_flag["Iw"] = "INAT_MAKE_IMM(INAT_IMM_WORD)" - imm_flag["Id"] = "INAT_MAKE_IMM(INAT_IMM_DWORD)" - imm_flag["Iq"] = "INAT_MAKE_IMM(INAT_IMM_QWORD)" - imm_flag["Ap"] = "INAT_MAKE_IMM(INAT_IMM_PTR)" - imm_flag["Iz"] = "INAT_MAKE_IMM(INAT_IMM_VWORD32)" - imm_flag["Jz"] = "INAT_MAKE_IMM(INAT_IMM_VWORD32)" - imm_flag["Iv"] = "INAT_MAKE_IMM(INAT_IMM_VWORD)" - imm_flag["Ob"] = "INAT_MOFFSET" - imm_flag["Ov"] = "INAT_MOFFSET" - imm_flag["Lx"] = "INAT_MAKE_IMM(INAT_IMM_BYTE)" - - modrm_expr = "^([CDEGMNPQRSUVW/][a-z]+|NTA|T[012])" - force64_expr = "\\([df]64\$" - rex_expr = "^REX(\\.[XRWB]+)*" - fpu_expr = "^ESC" # TODO - - lprefix1_expr = "\$(66|!F3)\$" - lprefix2_expr = "\$F3\$" - lprefix3_expr = "\$(F2|!F3|66\\&F2)\$" - lprefix_expr = "\$(66|F2|F3)\$" - max_lprefix = 4 - - # All opcodes starting with lower-case 'v', 'k' or with (v1) superscript - # accepts VEX prefix - vexok_opcode_expr = "^[vk].*" - vexok_expr = "\$v1\$" - # All opcodes with (v) superscript supports *only* VEX prefix - vexonly_expr = "\$v\$" - # All opcodes with (ev) superscript supports *only* EVEX prefix - evexonly_expr = "\$ev\$" - - prefix_expr = "\$Prefix\$" - prefix_num["Operand-Size"] = "INAT_PFX_OPNDSZ" - prefix_num["REPNE"] = "INAT_PFX_REPNE" - prefix_num["REP/REPE"] = "INAT_PFX_REPE" - prefix_num["XACQUIRE"] = "INAT_PFX_REPNE" - prefix_num["XRELEASE"] = "INAT_PFX_REPE" - prefix_num["LOCK"] = "INAT_PFX_LOCK" - prefix_num["SEG=CS"] = "INAT_PFX_CS" - prefix_num["SEG=DS"] = "INAT_PFX_DS" - prefix_num["SEG=ES"] = "INAT_PFX_ES" - prefix_num["SEG=FS"] = "INAT_PFX_FS" - prefix_num["SEG=GS"] = "INAT_PFX_GS" - prefix_num["SEG=SS"] = "INAT_PFX_SS" - prefix_num["Address-Size"] = "INAT_PFX_ADDRSZ" - prefix_num["VEX+1byte"] = "INAT_PFX_VEX2" - prefix_num["VEX+2byte"] = "INAT_PFX_VEX3" - prefix_num["EVEX"] = "INAT_PFX_EVEX" - - clear_vars() -} - -function semantic_error(msg) { - print "Semantic error at " NR ": " msg > "/dev/stderr" - exit 1 -} - -function debug(msg) { - print "DEBUG: " msg -} - -function array_size(arr, i,c) { - c = 0 - for (i in arr) - c++ - return c -} - -/^Table:/ { - print "/* " $0 " */" - if (tname != "") - semantic_error("Hit Table: before EndTable:."); -} - -/^Referrer:/ { - if (NF != 1) { - # escape opcode table - ref = "" - for (i = 2; i <= NF; i++) - ref = ref $i - eid = escape[ref] - tname = sprintf("inat_escape_table_%d", eid) - } -} - -/^AVXcode:/ { - if (NF != 1) { - # AVX/escape opcode table - aid = $2 - if (gaid <= aid) - gaid = aid + 1 - if (tname == "") # AVX only opcode table - tname = sprintf("inat_avx_table_%d", $2) - } - if (aid == -1 && eid == -1) # primary opcode table - tname = "inat_primary_table" -} - -/^GrpTable:/ { - print "/* " $0 " */" - if (!($2 in group)) - semantic_error("No group: " $2 ) - gid = group[$2] - tname = "inat_group_table_" gid -} - -function print_table(tbl,name,fmt,n) -{ - print "const insn_attr_t " name " = {" - for (i = 0; i < n; i++) { - id = sprintf(fmt, i) - if (tbl[id]) - print " [" id "] = " tbl[id] "," - } - print "};" -} - -/^EndTable/ { - if (gid != -1) { - # print group tables - if (array_size(table) != 0) { - print_table(table, tname "[INAT_GROUP_TABLE_SIZE]", - "0x%x", 8) - gtable[gid,0] = tname - } - if (array_size(lptable1) != 0) { - print_table(lptable1, tname "_1[INAT_GROUP_TABLE_SIZE]", - "0x%x", 8) - gtable[gid,1] = tname "_1" - } - if (array_size(lptable2) != 0) { - print_table(lptable2, tname "_2[INAT_GROUP_TABLE_SIZE]", - "0x%x", 8) - gtable[gid,2] = tname "_2" - } - if (array_size(lptable3) != 0) { - print_table(lptable3, tname "_3[INAT_GROUP_TABLE_SIZE]", - "0x%x", 8) - gtable[gid,3] = tname "_3" - } - } else { - # print primary/escaped tables - if (array_size(table) != 0) { - print_table(table, tname "[INAT_OPCODE_TABLE_SIZE]", - "0x%02x", 256) - etable[eid,0] = tname - if (aid >= 0) - atable[aid,0] = tname - } - if (array_size(lptable1) != 0) { - print_table(lptable1,tname "_1[INAT_OPCODE_TABLE_SIZE]", - "0x%02x", 256) - etable[eid,1] = tname "_1" - if (aid >= 0) - atable[aid,1] = tname "_1" - } - if (array_size(lptable2) != 0) { - print_table(lptable2,tname "_2[INAT_OPCODE_TABLE_SIZE]", - "0x%02x", 256) - etable[eid,2] = tname "_2" - if (aid >= 0) - atable[aid,2] = tname "_2" - } - if (array_size(lptable3) != 0) { - print_table(lptable3,tname "_3[INAT_OPCODE_TABLE_SIZE]", - "0x%02x", 256) - etable[eid,3] = tname "_3" - if (aid >= 0) - atable[aid,3] = tname "_3" - } - } - print "" - clear_vars() -} - -function add_flags(old,new) { - if (old && new) - return old " | " new - else if (old) - return old - else - return new -} - -# convert operands to flags. -function convert_operands(count,opnd, i,j,imm,mod) -{ - imm = null - mod = null - for (j = 1; j <= count; j++) { - i = opnd[j] - if (match(i, imm_expr) == 1) { - if (!imm_flag[i]) - semantic_error("Unknown imm opnd: " i) - if (imm) { - if (i != "Ib") - semantic_error("Second IMM error") - imm = add_flags(imm, "INAT_SCNDIMM") - } else - imm = imm_flag[i] - } else if (match(i, modrm_expr)) - mod = "INAT_MODRM" - } - return add_flags(imm, mod) -} - -/^[0-9a-f]+\:/ { - if (NR == 1) - next - # get index - idx = "0x" substr($1, 1, index($1,":") - 1) - if (idx in table) - semantic_error("Redefine " idx " in " tname) - - # check if escaped opcode - if ("escape" == $2) { - if ($3 != "#") - semantic_error("No escaped name") - ref = "" - for (i = 4; i <= NF; i++) - ref = ref $i - if (ref in escape) - semantic_error("Redefine escape (" ref ")") - escape[ref] = geid - geid++ - table[idx] = "INAT_MAKE_ESCAPE(" escape[ref] ")" - next - } - - variant = null - # converts - i = 2 - while (i <= NF) { - opcode = $(i++) - delete opnds - ext = null - flags = null - opnd = null - # parse one opcode - if (match($i, opnd_expr)) { - opnd = $i - count = split($(i++), opnds, ",") - flags = convert_operands(count, opnds) - } - if (match($i, ext_expr)) - ext = $(i++) - if (match($i, sep_expr)) - i++ - else if (i < NF) - semantic_error($i " is not a separator") - - # check if group opcode - if (match(opcode, group_expr)) { - if (!(opcode in group)) { - group[opcode] = ggid - ggid++ - } - flags = add_flags(flags, "INAT_MAKE_GROUP(" group[opcode] ")") - } - # check force(or default) 64bit - if (match(ext, force64_expr)) - flags = add_flags(flags, "INAT_FORCE64") - - # check REX prefix - if (match(opcode, rex_expr)) - flags = add_flags(flags, "INAT_MAKE_PREFIX(INAT_PFX_REX)") - - # check coprocessor escape : TODO - if (match(opcode, fpu_expr)) - flags = add_flags(flags, "INAT_MODRM") - - # check VEX codes - if (match(ext, evexonly_expr)) - flags = add_flags(flags, "INAT_VEXOK | INAT_EVEXONLY") - else if (match(ext, vexonly_expr)) - flags = add_flags(flags, "INAT_VEXOK | INAT_VEXONLY") - else if (match(ext, vexok_expr) || match(opcode, vexok_opcode_expr)) - flags = add_flags(flags, "INAT_VEXOK") - - # check prefixes - if (match(ext, prefix_expr)) { - if (!prefix_num[opcode]) - semantic_error("Unknown prefix: " opcode) - flags = add_flags(flags, "INAT_MAKE_PREFIX(" prefix_num[opcode] ")") - } - if (length(flags) == 0) - continue - # check if last prefix - if (match(ext, lprefix1_expr)) { - lptable1[idx] = add_flags(lptable1[idx],flags) - variant = "INAT_VARIANT" - } - if (match(ext, lprefix2_expr)) { - lptable2[idx] = add_flags(lptable2[idx],flags) - variant = "INAT_VARIANT" - } - if (match(ext, lprefix3_expr)) { - lptable3[idx] = add_flags(lptable3[idx],flags) - variant = "INAT_VARIANT" - } - if (!match(ext, lprefix_expr)){ - table[idx] = add_flags(table[idx],flags) - } - } - if (variant) - table[idx] = add_flags(table[idx],variant) -} - -END { - if (awkchecked != "") - exit 1 - # print escape opcode map's array - print "/* Escape opcode map array */" - print "const insn_attr_t * const inat_escape_tables[INAT_ESC_MAX + 1]" \ - "[INAT_LSTPFX_MAX + 1] = {" - for (i = 0; i < geid; i++) - for (j = 0; j < max_lprefix; j++) - if (etable[i,j]) - print " ["i"]["j"] = "etable[i,j]"," - print "};\n" - # print group opcode map's array - print "/* Group opcode map array */" - print "const insn_attr_t * const inat_group_tables[INAT_GRP_MAX + 1]"\ - "[INAT_LSTPFX_MAX + 1] = {" - for (i = 0; i < ggid; i++) - for (j = 0; j < max_lprefix; j++) - if (gtable[i,j]) - print " ["i"]["j"] = "gtable[i,j]"," - print "};\n" - # print AVX opcode map's array - print "/* AVX opcode map array */" - print "const insn_attr_t * const inat_avx_tables[X86_VEX_M_MAX + 1]"\ - "[INAT_LSTPFX_MAX + 1] = {" - for (i = 0; i < gaid; i++) - for (j = 0; j < max_lprefix; j++) - if (atable[i,j]) - print " ["i"]["j"] = "atable[i,j]"," - print "};" -} - diff --git a/tools/objtool/arch/x86/insn/inat.c b/tools/objtool/arch/x86/insn/inat.c deleted file mode 100644 index e4bf28e6f4c7..000000000000 --- a/tools/objtool/arch/x86/insn/inat.c +++ /dev/null @@ -1,97 +0,0 @@ -/* - * x86 instruction attribute tables - * - * Written by Masami Hiramatsu <mhiramat(a)redhat.com> - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. - * - */ -#include "insn.h" - -/* Attribute tables are generated from opcode map */ -#include "inat-tables.c" - -/* Attribute search APIs */ -insn_attr_t inat_get_opcode_attribute(insn_byte_t opcode) -{ - return inat_primary_table[opcode]; -} - -int inat_get_last_prefix_id(insn_byte_t last_pfx) -{ - insn_attr_t lpfx_attr; - - lpfx_attr = inat_get_opcode_attribute(last_pfx); - return inat_last_prefix_id(lpfx_attr); -} - -insn_attr_t inat_get_escape_attribute(insn_byte_t opcode, int lpfx_id, - insn_attr_t esc_attr) -{ - const insn_attr_t *table; - int n; - - n = inat_escape_id(esc_attr); - - table = inat_escape_tables[n][0]; - if (!table) - return 0; - if (inat_has_variant(table[opcode]) && lpfx_id) { - table = inat_escape_tables[n][lpfx_id]; - if (!table) - return 0; - } - return table[opcode]; -} - -insn_attr_t inat_get_group_attribute(insn_byte_t modrm, int lpfx_id, - insn_attr_t grp_attr) -{ - const insn_attr_t *table; - int n; - - n = inat_group_id(grp_attr); - - table = inat_group_tables[n][0]; - if (!table) - return inat_group_common_attribute(grp_attr); - if (inat_has_variant(table[X86_MODRM_REG(modrm)]) && lpfx_id) { - table = inat_group_tables[n][lpfx_id]; - if (!table) - return inat_group_common_attribute(grp_attr); - } - return table[X86_MODRM_REG(modrm)] | - inat_group_common_attribute(grp_attr); -} - -insn_attr_t inat_get_avx_attribute(insn_byte_t opcode, insn_byte_t vex_m, - insn_byte_t vex_p) -{ - const insn_attr_t *table; - if (vex_m > X86_VEX_M_MAX || vex_p > INAT_LSTPFX_MAX) - return 0; - /* At first, this checks the master table */ - table = inat_avx_tables[vex_m][0]; - if (!table) - return 0; - if (!inat_is_group(table[opcode]) && vex_p) { - /* If this is not a group, get attribute directly */ - table = inat_avx_tables[vex_m][vex_p]; - if (!table) - return 0; - } - return table[opcode]; -} - diff --git a/tools/objtool/arch/x86/insn/inat.h b/tools/objtool/arch/x86/insn/inat.h deleted file mode 100644 index 125ecd2a300d..000000000000 --- a/tools/objtool/arch/x86/insn/inat.h +++ /dev/null @@ -1,234 +0,0 @@ -#ifndef _ASM_X86_INAT_H -#define _ASM_X86_INAT_H -/* - * x86 instruction attributes - * - * Written by Masami Hiramatsu <mhiramat(a)redhat.com> - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. - * - */ -#include "inat_types.h" - -/* - * Internal bits. Don't use bitmasks directly, because these bits are - * unstable. You should use checking functions. - */ - -#define INAT_OPCODE_TABLE_SIZE 256 -#define INAT_GROUP_TABLE_SIZE 8 - -/* Legacy last prefixes */ -#define INAT_PFX_OPNDSZ 1 /* 0x66 */ /* LPFX1 */ -#define INAT_PFX_REPE 2 /* 0xF3 */ /* LPFX2 */ -#define INAT_PFX_REPNE 3 /* 0xF2 */ /* LPFX3 */ -/* Other Legacy prefixes */ -#define INAT_PFX_LOCK 4 /* 0xF0 */ -#define INAT_PFX_CS 5 /* 0x2E */ -#define INAT_PFX_DS 6 /* 0x3E */ -#define INAT_PFX_ES 7 /* 0x26 */ -#define INAT_PFX_FS 8 /* 0x64 */ -#define INAT_PFX_GS 9 /* 0x65 */ -#define INAT_PFX_SS 10 /* 0x36 */ -#define INAT_PFX_ADDRSZ 11 /* 0x67 */ -/* x86-64 REX prefix */ -#define INAT_PFX_REX 12 /* 0x4X */ -/* AVX VEX prefixes */ -#define INAT_PFX_VEX2 13 /* 2-bytes VEX prefix */ -#define INAT_PFX_VEX3 14 /* 3-bytes VEX prefix */ -#define INAT_PFX_EVEX 15 /* EVEX prefix */ - -#define INAT_LSTPFX_MAX 3 -#define INAT_LGCPFX_MAX 11 - -/* Immediate size */ -#define INAT_IMM_BYTE 1 -#define INAT_IMM_WORD 2 -#define INAT_IMM_DWORD 3 -#define INAT_IMM_QWORD 4 -#define INAT_IMM_PTR 5 -#define INAT_IMM_VWORD32 6 -#define INAT_IMM_VWORD 7 - -/* Legacy prefix */ -#define INAT_PFX_OFFS 0 -#define INAT_PFX_BITS 4 -#define INAT_PFX_MAX ((1 << INAT_PFX_BITS) - 1) -#define INAT_PFX_MASK (INAT_PFX_MAX << INAT_PFX_OFFS) -/* Escape opcodes */ -#define INAT_ESC_OFFS (INAT_PFX_OFFS + INAT_PFX_BITS) -#define INAT_ESC_BITS 2 -#define INAT_ESC_MAX ((1 << INAT_ESC_BITS) - 1) -#define INAT_ESC_MASK (INAT_ESC_MAX << INAT_ESC_OFFS) -/* Group opcodes (1-16) */ -#define INAT_GRP_OFFS (INAT_ESC_OFFS + INAT_ESC_BITS) -#define INAT_GRP_BITS 5 -#define INAT_GRP_MAX ((1 << INAT_GRP_BITS) - 1) -#define INAT_GRP_MASK (INAT_GRP_MAX << INAT_GRP_OFFS) -/* Immediates */ -#define INAT_IMM_OFFS (INAT_GRP_OFFS + INAT_GRP_BITS) -#define INAT_IMM_BITS 3 -#define INAT_IMM_MASK (((1 << INAT_IMM_BITS) - 1) << INAT_IMM_OFFS) -/* Flags */ -#define INAT_FLAG_OFFS (INAT_IMM_OFFS + INAT_IMM_BITS) -#define INAT_MODRM (1 << (INAT_FLAG_OFFS)) -#define INAT_FORCE64 (1 << (INAT_FLAG_OFFS + 1)) -#define INAT_SCNDIMM (1 << (INAT_FLAG_OFFS + 2)) -#define INAT_MOFFSET (1 << (INAT_FLAG_OFFS + 3)) -#define INAT_VARIANT (1 << (INAT_FLAG_OFFS + 4)) -#define INAT_VEXOK (1 << (INAT_FLAG_OFFS + 5)) -#define INAT_VEXONLY (1 << (INAT_FLAG_OFFS + 6)) -#define INAT_EVEXONLY (1 << (INAT_FLAG_OFFS + 7)) -/* Attribute making macros for attribute tables */ -#define INAT_MAKE_PREFIX(pfx) (pfx << INAT_PFX_OFFS) -#define INAT_MAKE_ESCAPE(esc) (esc << INAT_ESC_OFFS) -#define INAT_MAKE_GROUP(grp) ((grp << INAT_GRP_OFFS) | INAT_MODRM) -#define INAT_MAKE_IMM(imm) (imm << INAT_IMM_OFFS) - -/* Attribute search APIs */ -extern insn_attr_t inat_get_opcode_attribute(insn_byte_t opcode); -extern int inat_get_last_prefix_id(insn_byte_t last_pfx); -extern insn_attr_t inat_get_escape_attribute(insn_byte_t opcode, - int lpfx_id, - insn_attr_t esc_attr); -extern insn_attr_t inat_get_group_attribute(insn_byte_t modrm, - int lpfx_id, - insn_attr_t esc_attr); -extern insn_attr_t inat_get_avx_attribute(insn_byte_t opcode, - insn_byte_t vex_m, - insn_byte_t vex_pp); - -/* Attribute checking functions */ -static inline int inat_is_legacy_prefix(insn_attr_t attr) -{ - attr &= INAT_PFX_MASK; - return attr && attr <= INAT_LGCPFX_MAX; -} - -static inline int inat_is_address_size_prefix(insn_attr_t attr) -{ - return (attr & INAT_PFX_MASK) == INAT_PFX_ADDRSZ; -} - -static inline int inat_is_operand_size_prefix(insn_attr_t attr) -{ - return (attr & INAT_PFX_MASK) == INAT_PFX_OPNDSZ; -} - -static inline int inat_is_rex_prefix(insn_attr_t attr) -{ - return (attr & INAT_PFX_MASK) == INAT_PFX_REX; -} - -static inline int inat_last_prefix_id(insn_attr_t attr) -{ - if ((attr & INAT_PFX_MASK) > INAT_LSTPFX_MAX) - return 0; - else - return attr & INAT_PFX_MASK; -} - -static inline int inat_is_vex_prefix(insn_attr_t attr) -{ - attr &= INAT_PFX_MASK; - return attr == INAT_PFX_VEX2 || attr == INAT_PFX_VEX3 || - attr == INAT_PFX_EVEX; -} - -static inline int inat_is_evex_prefix(insn_attr_t attr) -{ - return (attr & INAT_PFX_MASK) == INAT_PFX_EVEX; -} - -static inline int inat_is_vex3_prefix(insn_attr_t attr) -{ - return (attr & INAT_PFX_MASK) == INAT_PFX_VEX3; -} - -static inline int inat_is_escape(insn_attr_t attr) -{ - return attr & INAT_ESC_MASK; -} - -static inline int inat_escape_id(insn_attr_t attr) -{ - return (attr & INAT_ESC_MASK) >> INAT_ESC_OFFS; -} - -static inline int inat_is_group(insn_attr_t attr) -{ - return attr & INAT_GRP_MASK; -} - -static inline int inat_group_id(insn_attr_t attr) -{ - return (attr & INAT_GRP_MASK) >> INAT_GRP_OFFS; -} - -static inline int inat_group_common_attribute(insn_attr_t attr) -{ - return attr & ~INAT_GRP_MASK; -} - -static inline int inat_has_immediate(insn_attr_t attr) -{ - return attr & INAT_IMM_MASK; -} - -static inline int inat_immediate_size(insn_attr_t attr) -{ - return (attr & INAT_IMM_MASK) >> INAT_IMM_OFFS; -} - -static inline int inat_has_modrm(insn_attr_t attr) -{ - return attr & INAT_MODRM; -} - -static inline int inat_is_force64(insn_attr_t attr) -{ - return attr & INAT_FORCE64; -} - -static inline int inat_has_second_immediate(insn_attr_t attr) -{ - return attr & INAT_SCNDIMM; -} - -static inline int inat_has_moffset(insn_attr_t attr) -{ - return attr & INAT_MOFFSET; -} - -static inline int inat_has_variant(insn_attr_t attr) -{ - return attr & INAT_VARIANT; -} - -static inline int inat_accept_vex(insn_attr_t attr) -{ - return attr & INAT_VEXOK; -} - -static inline int inat_must_vex(insn_attr_t attr) -{ - return attr & (INAT_VEXONLY | INAT_EVEXONLY); -} - -static inline int inat_must_evex(insn_attr_t attr) -{ - return attr & INAT_EVEXONLY; -} -#endif diff --git a/tools/objtool/arch/x86/insn/inat_types.h b/tools/objtool/arch/x86/insn/inat_types.h deleted file mode 100644 index cb3c20ce39cf..000000000000 --- a/tools/objtool/arch/x86/insn/inat_types.h +++ /dev/null @@ -1,29 +0,0 @@ -#ifndef _ASM_X86_INAT_TYPES_H -#define _ASM_X86_INAT_TYPES_H -/* - * x86 instruction attributes - * - * Written by Masami Hiramatsu <mhiramat(a)redhat.com> - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. - * - */ - -/* Instruction attributes */ -typedef unsigned int insn_attr_t; -typedef unsigned char insn_byte_t; -typedef signed int insn_value_t; - -#endif diff --git a/tools/objtool/arch/x86/insn/insn.c b/tools/objtool/arch/x86/insn/insn.c deleted file mode 100644 index ca983e2bea8b..000000000000 --- a/tools/objtool/arch/x86/insn/insn.c +++ /dev/null @@ -1,606 +0,0 @@ -/* - * x86 instruction analysis - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. - * - * Copyright (C) IBM Corporation, 2002, 2004, 2009 - */ - -#ifdef __KERNEL__ -#include <linux/string.h> -#else -#include <string.h> -#endif -#include "inat.h" -#include "insn.h" - -/* Verify next sizeof(t) bytes can be on the same instruction */ -#define validate_next(t, insn, n) \ - ((insn)->next_byte + sizeof(t) + n <= (insn)->end_kaddr) - -#define __get_next(t, insn) \ - ({ t r = *(t*)insn->next_byte; insn->next_byte += sizeof(t); r; }) - -#define __peek_nbyte_next(t, insn, n) \ - ({ t r = *(t*)((insn)->next_byte + n); r; }) - -#define get_next(t, insn) \ - ({ if (unlikely(!validate_next(t, insn, 0))) goto err_out; __get_next(t, insn); }) - -#define peek_nbyte_next(t, insn, n) \ - ({ if (unlikely(!validate_next(t, insn, n))) goto err_out; __peek_nbyte_next(t, insn, n); }) - -#define peek_next(t, insn) peek_nbyte_next(t, insn, 0) - -/** - * insn_init() - initialize struct insn - * @insn: &struct insn to be initialized - * @kaddr: address (in kernel memory) of instruction (or copy thereof) - * @x86_64: !0 for 64-bit kernel or 64-bit app - */ -void insn_init(struct insn *insn, const void *kaddr, int buf_len, int x86_64) -{ - /* - * Instructions longer than MAX_INSN_SIZE (15 bytes) are invalid - * even if the input buffer is long enough to hold them. - */ - if (buf_len > MAX_INSN_SIZE) - buf_len = MAX_INSN_SIZE; - - memset(insn, 0, sizeof(*insn)); - insn->kaddr = kaddr; - insn->end_kaddr = kaddr + buf_len; - insn->next_byte = kaddr; - insn->x86_64 = x86_64 ? 1 : 0; - insn->opnd_bytes = 4; - if (x86_64) - insn->addr_bytes = 8; - else - insn->addr_bytes = 4; -} - -/** - * insn_get_prefixes - scan x86 instruction prefix bytes - * @insn: &struct insn containing instruction - * - * Populates the @insn->prefixes bitmap, and updates @insn->next_byte - * to point to the (first) opcode. No effect if @insn->prefixes.got - * is already set. - */ -void insn_get_prefixes(struct insn *insn) -{ - struct insn_field *prefixes = &insn->prefixes; - insn_attr_t attr; - insn_byte_t b, lb; - int i, nb; - - if (prefixes->got) - return; - - nb = 0; - lb = 0; - b = peek_next(insn_byte_t, insn); - attr = inat_get_opcode_attribute(b); - while (inat_is_legacy_prefix(attr)) { - /* Skip if same prefix */ - for (i = 0; i < nb; i++) - if (prefixes->bytes[i] == b) - goto found; - if (nb == 4) - /* Invalid instruction */ - break; - prefixes->bytes[nb++] = b; - if (inat_is_address_size_prefix(attr)) { - /* address size switches 2/4 or 4/8 */ - if (insn->x86_64) - insn->addr_bytes ^= 12; - else - insn->addr_bytes ^= 6; - } else if (inat_is_operand_size_prefix(attr)) { - /* oprand size switches 2/4 */ - insn->opnd_bytes ^= 6; - } -found: - prefixes->nbytes++; - insn->next_byte++; - lb = b; - b = peek_next(insn_byte_t, insn); - attr = inat_get_opcode_attribute(b); - } - /* Set the last prefix */ - if (lb && lb != insn->prefixes.bytes[3]) { - if (unlikely(insn->prefixes.bytes[3])) { - /* Swap the last prefix */ - b = insn->prefixes.bytes[3]; - for (i = 0; i < nb; i++) - if (prefixes->bytes[i] == lb) - prefixes->bytes[i] = b; - } - insn->prefixes.bytes[3] = lb; - } - - /* Decode REX prefix */ - if (insn->x86_64) { - b = peek_next(insn_byte_t, insn); - attr = inat_get_opcode_attribute(b); - if (inat_is_rex_prefix(attr)) { - insn->rex_prefix.value = b; - insn->rex_prefix.nbytes = 1; - insn->next_byte++; - if (X86_REX_W(b)) - /* REX.W overrides opnd_size */ - insn->opnd_bytes = 8; - } - } - insn->rex_prefix.got = 1; - - /* Decode VEX prefix */ - b = peek_next(insn_byte_t, insn); - attr = inat_get_opcode_attribute(b); - if (inat_is_vex_prefix(attr)) { - insn_byte_t b2 = peek_nbyte_next(insn_byte_t, insn, 1); - if (!insn->x86_64) { - /* - * In 32-bits mode, if the [7:6] bits (mod bits of - * ModRM) on the second byte are not 11b, it is - * LDS or LES or BOUND. - */ - if (X86_MODRM_MOD(b2) != 3) - goto vex_end; - } - insn->vex_prefix.bytes[0] = b; - insn->vex_prefix.bytes[1] = b2; - if (inat_is_evex_prefix(attr)) { - b2 = peek_nbyte_next(insn_byte_t, insn, 2); - insn->vex_prefix.bytes[2] = b2; - b2 = peek_nbyte_next(insn_byte_t, insn, 3); - insn->vex_prefix.bytes[3] = b2; - insn->vex_prefix.nbytes = 4; - insn->next_byte += 4; - if (insn->x86_64 && X86_VEX_W(b2)) - /* VEX.W overrides opnd_size */ - insn->opnd_bytes = 8; - } else if (inat_is_vex3_prefix(attr)) { - b2 = peek_nbyte_next(insn_byte_t, insn, 2); - insn->vex_prefix.bytes[2] = b2; - insn->vex_prefix.nbytes = 3; - insn->next_byte += 3; - if (insn->x86_64 && X86_VEX_W(b2)) - /* VEX.W overrides opnd_size */ - insn->opnd_bytes = 8; - } else { - /* - * For VEX2, fake VEX3-like byte#2. - * Makes it easier to decode vex.W, vex.vvvv, - * vex.L and vex.pp. Masking with 0x7f sets vex.W == 0. - */ - insn->vex_prefix.bytes[2] = b2 & 0x7f; - insn->vex_prefix.nbytes = 2; - insn->next_byte += 2; - } - } -vex_end: - insn->vex_prefix.got = 1; - - prefixes->got = 1; - -err_out: - return; -} - -/** - * insn_get_opcode - collect opcode(s) - * @insn: &struct insn containing instruction - * - * Populates @insn->opcode, updates @insn->next_byte to point past the - * opcode byte(s), and set @insn->attr (except for groups). - * If necessary, first collects any preceding (prefix) bytes. - * Sets @insn->opcode.value = opcode1. No effect if @insn->opcode.got - * is already 1. - */ -void insn_get_opcode(struct insn *insn) -{ - struct insn_field *opcode = &insn->opcode; - insn_byte_t op; - int pfx_id; - if (opcode->got) - return; - if (!insn->prefixes.got) - insn_get_prefixes(insn); - - /* Get first opcode */ - op = get_next(insn_byte_t, insn); - opcode->bytes[0] = op; - opcode->nbytes = 1; - - /* Check if there is VEX prefix or not */ - if (insn_is_avx(insn)) { - insn_byte_t m, p; - m = insn_vex_m_bits(insn); - p = insn_vex_p_bits(insn); - insn->attr = inat_get_avx_attribute(op, m, p); - if ((inat_must_evex(insn->attr) && !insn_is_evex(insn)) || - (!inat_accept_vex(insn->attr) && - !inat_is_group(insn->attr))) - insn->attr = 0; /* This instruction is bad */ - goto end; /* VEX has only 1 byte for opcode */ - } - - insn->attr = inat_get_opcode_attribute(op); - while (inat_is_escape(insn->attr)) { - /* Get escaped opcode */ - op = get_next(insn_byte_t, insn); - opcode->bytes[opcode->nbytes++] = op; - pfx_id = insn_last_prefix_id(insn); - insn->attr = inat_get_escape_attribute(op, pfx_id, insn->attr); - } - if (inat_must_vex(insn->attr)) - insn->attr = 0; /* This instruction is bad */ -end: - opcode->got = 1; - -err_out: - return; -} - -/** - * insn_get_modrm - collect ModRM byte, if any - * @insn: &struct insn containing instruction - * - * Populates @insn->modrm and updates @insn->next_byte to point past the - * ModRM byte, if any. If necessary, first collects the preceding bytes - * (prefixes and opcode(s)). No effect if @insn->modrm.got is already 1. - */ -void insn_get_modrm(struct insn *insn) -{ - struct insn_field *modrm = &insn->modrm; - insn_byte_t pfx_id, mod; - if (modrm->got) - return; - if (!insn->opcode.got) - insn_get_opcode(insn); - - if (inat_has_modrm(insn->attr)) { - mod = get_next(insn_byte_t, insn); - modrm->value = mod; - modrm->nbytes = 1; - if (inat_is_group(insn->attr)) { - pfx_id = insn_last_prefix_id(insn); - insn->attr = inat_get_group_attribute(mod, pfx_id, - insn->attr); - if (insn_is_avx(insn) && !inat_accept_vex(insn->attr)) - insn->attr = 0; /* This is bad */ - } - } - - if (insn->x86_64 && inat_is_force64(insn->attr)) - insn->opnd_bytes = 8; - modrm->got = 1; - -err_out: - return; -} - - -/** - * insn_rip_relative() - Does instruction use RIP-relative addressing mode? - * @insn: &struct insn containing instruction - * - * If necessary, first collects the instruction up to and including the - * ModRM byte. No effect if @insn->x86_64 is 0. - */ -int insn_rip_relative(struct insn *insn) -{ - struct insn_field *modrm = &insn->modrm; - - if (!insn->x86_64) - return 0; - if (!modrm->got) - insn_get_modrm(insn); - /* - * For rip-relative instructions, the mod field (top 2 bits) - * is zero and the r/m field (bottom 3 bits) is 0x5. - */ - return (modrm->nbytes && (modrm->value & 0xc7) == 0x5); -} - -/** - * insn_get_sib() - Get the SIB byte of instruction - * @insn: &struct insn containing instruction - * - * If necessary, first collects the instruction up to and including the - * ModRM byte. - */ -void insn_get_sib(struct insn *insn) -{ - insn_byte_t modrm; - - if (insn->sib.got) - return; - if (!insn->modrm.got) - insn_get_modrm(insn); - if (insn->modrm.nbytes) { - modrm = (insn_byte_t)insn->modrm.value; - if (insn->addr_bytes != 2 && - X86_MODRM_MOD(modrm) != 3 && X86_MODRM_RM(modrm) == 4) { - insn->sib.value = get_next(insn_byte_t, insn); - insn->sib.nbytes = 1; - } - } - insn->sib.got = 1; - -err_out: - return; -} - - -/** - * insn_get_displacement() - Get the displacement of instruction - * @insn: &struct insn containing instruction - * - * If necessary, first collects the instruction up to and including the - * SIB byte. - * Displacement value is sign-expanded. - */ -void insn_get_displacement(struct insn *insn) -{ - insn_byte_t mod, rm, base; - - if (insn->displacement.got) - return; - if (!insn->sib.got) - insn_get_sib(insn); - if (insn->modrm.nbytes) { - /* - * Interpreting the modrm byte: - * mod = 00 - no displacement fields (exceptions below) - * mod = 01 - 1-byte displacement field - * mod = 10 - displacement field is 4 bytes, or 2 bytes if - * address size = 2 (0x67 prefix in 32-bit mode) - * mod = 11 - no memory operand - * - * If address size = 2... - * mod = 00, r/m = 110 - displacement field is 2 bytes - * - * If address size != 2... - * mod != 11, r/m = 100 - SIB byte exists - * mod = 00, SIB base = 101 - displacement field is 4 bytes - * mod = 00, r/m = 101 - rip-relative addressing, displacement - * field is 4 bytes - */ - mod = X86_MODRM_MOD(insn->modrm.value); - rm = X86_MODRM_RM(insn->modrm.value); - base = X86_SIB_BASE(insn->sib.value); - if (mod == 3) - goto out; - if (mod == 1) { - insn->displacement.value = get_next(signed char, insn); - insn->displacement.nbytes = 1; - } else if (insn->addr_bytes == 2) { - if ((mod == 0 && rm == 6) || mod == 2) { - insn->displacement.value = - get_next(short, insn); - insn->displacement.nbytes = 2; - } - } else { - if ((mod == 0 && rm == 5) || mod == 2 || - (mod == 0 && base == 5)) { - insn->displacement.value = get_next(int, insn); - insn->displacement.nbytes = 4; - } - } - } -out: - insn->displacement.got = 1; - -err_out: - return; -} - -/* Decode moffset16/32/64. Return 0 if failed */ -static int __get_moffset(struct insn *insn) -{ - switch (insn->addr_bytes) { - case 2: - insn->moffset1.value = get_next(short, insn); - insn->moffset1.nbytes = 2; - break; - case 4: - insn->moffset1.value = get_next(int, insn); - insn->moffset1.nbytes = 4; - break; - case 8: - insn->moffset1.value = get_next(int, insn); - insn->moffset1.nbytes = 4; - insn->moffset2.value = get_next(int, insn); - insn->moffset2.nbytes = 4; - break; - default: /* opnd_bytes must be modified manually */ - goto err_out; - } - insn->moffset1.got = insn->moffset2.got = 1; - - return 1; - -err_out: - return 0; -} - -/* Decode imm v32(Iz). Return 0 if failed */ -static int __get_immv32(struct insn *insn) -{ - switch (insn->opnd_bytes) { - case 2: - insn->immediate.value = get_next(short, insn); - insn->immediate.nbytes = 2; - break; - case 4: - case 8: - insn->immediate.value = get_next(int, insn); - insn->immediate.nbytes = 4; - break; - default: /* opnd_bytes must be modified manually */ - goto err_out; - } - - return 1; - -err_out: - return 0; -} - -/* Decode imm v64(Iv/Ov), Return 0 if failed */ -static int __get_immv(struct insn *insn) -{ - switch (insn->opnd_bytes) { - case 2: - insn->immediate1.value = get_next(short, insn); - insn->immediate1.nbytes = 2; - break; - case 4: - insn->immediate1.value = get_next(int, insn); - insn->immediate1.nbytes = 4; - break; - case 8: - insn->immediate1.value = get_next(int, insn); - insn->immediate1.nbytes = 4; - insn->immediate2.value = get_next(int, insn); - insn->immediate2.nbytes = 4; - break; - default: /* opnd_bytes must be modified manually */ - goto err_out; - } - insn->immediate1.got = insn->immediate2.got = 1; - - return 1; -err_out: - return 0; -} - -/* Decode ptr16:16/32(Ap) */ -static int __get_immptr(struct insn *insn) -{ - switch (insn->opnd_bytes) { - case 2: - insn->immediate1.value = get_next(short, insn); - insn->immediate1.nbytes = 2; - break; - case 4: - insn->immediate1.value = get_next(int, insn); - insn->immediate1.nbytes = 4; - break; - case 8: - /* ptr16:64 is not exist (no segment) */ - return 0; - default: /* opnd_bytes must be modified manually */ - goto err_out; - } - insn->immediate2.value = get_next(unsigned short, insn); - insn->immediate2.nbytes = 2; - insn->immediate1.got = insn->immediate2.got = 1; - - return 1; -err_out: - return 0; -} - -/** - * insn_get_immediate() - Get the immediates of instruction - * @insn: &struct insn containing instruction - * - * If necessary, first collects the instruction up to and including the - * displacement bytes. - * Basically, most of immediates are sign-expanded. Unsigned-value can be - * get by bit masking with ((1 << (nbytes * 8)) - 1) - */ -void insn_get_immediate(struct insn *insn) -{ - if (insn->immediate.got) - return; - if (!insn->displacement.got) - insn_get_displacement(insn); - - if (inat_has_moffset(insn->attr)) { - if (!__get_moffset(insn)) - goto err_out; - goto done; - } - - if (!inat_has_immediate(insn->attr)) - /* no immediates */ - goto done; - - switch (inat_immediate_size(insn->attr)) { - case INAT_IMM_BYTE: - insn->immediate.value = get_next(signed char, insn); - insn->immediate.nbytes = 1; - break; - case INAT_IMM_WORD: - insn->immediate.value = get_next(short, insn); - insn->immediate.nbytes = 2; - break; - case INAT_IMM_DWORD: - insn->immediate.value = get_next(int, insn); - insn->immediate.nbytes = 4; - break; - case INAT_IMM_QWORD: - insn->immediate1.value = get_next(int, insn); - insn->immediate1.nbytes = 4; - insn->immediate2.value = get_next(int, insn); - insn->immediate2.nbytes = 4; - break; - case INAT_IMM_PTR: - if (!__get_immptr(insn)) - goto err_out; - break; - case INAT_IMM_VWORD32: - if (!__get_immv32(insn)) - goto err_out; - break; - case INAT_IMM_VWORD: - if (!__get_immv(insn)) - goto err_out; - break; - default: - /* Here, insn must have an immediate, but failed */ - goto err_out; - } - if (inat_has_second_immediate(insn->attr)) { - insn->immediate2.value = get_next(signed char, insn); - insn->immediate2.nbytes = 1; - } -done: - insn->immediate.got = 1; - -err_out: - return; -} - -/** - * insn_get_length() - Get the length of instruction - * @insn: &struct insn containing instruction - * - * If necessary, first collects the instruction up to and including the - * immediates bytes. - */ -void insn_get_length(struct insn *insn) -{ - if (insn->length) - return; - if (!insn->immediate.got) - insn_get_immediate(insn); - insn->length = (unsigned char)((unsigned long)insn->next_byte - - (unsigned long)insn->kaddr); -} diff --git a/tools/objtool/arch/x86/insn/insn.h b/tools/objtool/arch/x86/insn/insn.h deleted file mode 100644 index e23578c7b1be..000000000000 --- a/tools/objtool/arch/x86/insn/insn.h +++ /dev/null @@ -1,211 +0,0 @@ -#ifndef _ASM_X86_INSN_H -#define _ASM_X86_INSN_H -/* - * x86 instruction analysis - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. - * - * Copyright (C) IBM Corporation, 2009 - */ - -/* insn_attr_t is defined in inat.h */ -#include "inat.h" - -struct insn_field { - union { - insn_value_t value; - insn_byte_t bytes[4]; - }; - /* !0 if we've run insn_get_xxx() for this field */ - unsigned char got; - unsigned char nbytes; -}; - -struct insn { - struct insn_field prefixes; /* - * Prefixes - * prefixes.bytes[3]: last prefix - */ - struct insn_field rex_prefix; /* REX prefix */ - struct insn_field vex_prefix; /* VEX prefix */ - struct insn_field opcode; /* - * opcode.bytes[0]: opcode1 - * opcode.bytes[1]: opcode2 - * opcode.bytes[2]: opcode3 - */ - struct insn_field modrm; - struct insn_field sib; - struct insn_field displacement; - union { - struct insn_field immediate; - struct insn_field moffset1; /* for 64bit MOV */ - struct insn_field immediate1; /* for 64bit imm or off16/32 */ - }; - union { - struct insn_field moffset2; /* for 64bit MOV */ - struct insn_field immediate2; /* for 64bit imm or seg16 */ - }; - - insn_attr_t attr; - unsigned char opnd_bytes; - unsigned char addr_bytes; - unsigned char length; - unsigned char x86_64; - - const insn_byte_t *kaddr; /* kernel address of insn to analyze */ - const insn_byte_t *end_kaddr; /* kernel address of last insn in buffer */ - const insn_byte_t *next_byte; -}; - -#define MAX_INSN_SIZE 15 - -#define X86_MODRM_MOD(modrm) (((modrm) & 0xc0) >> 6) -#define X86_MODRM_REG(modrm) (((modrm) & 0x38) >> 3) -#define X86_MODRM_RM(modrm) ((modrm) & 0x07) - -#define X86_SIB_SCALE(sib) (((sib) & 0xc0) >> 6) -#define X86_SIB_INDEX(sib) (((sib) & 0x38) >> 3) -#define X86_SIB_BASE(sib) ((sib) & 0x07) - -#define X86_REX_W(rex) ((rex) & 8) -#define X86_REX_R(rex) ((rex) & 4) -#define X86_REX_X(rex) ((rex) & 2) -#define X86_REX_B(rex) ((rex) & 1) - -/* VEX bit flags */ -#define X86_VEX_W(vex) ((vex) & 0x80) /* VEX3 Byte2 */ -#define X86_VEX_R(vex) ((vex) & 0x80) /* VEX2/3 Byte1 */ -#define X86_VEX_X(vex) ((vex) & 0x40) /* VEX3 Byte1 */ -#define X86_VEX_B(vex) ((vex) & 0x20) /* VEX3 Byte1 */ -#define X86_VEX_L(vex) ((vex) & 0x04) /* VEX3 Byte2, VEX2 Byte1 */ -/* VEX bit fields */ -#define X86_EVEX_M(vex) ((vex) & 0x03) /* EVEX Byte1 */ -#define X86_VEX3_M(vex) ((vex) & 0x1f) /* VEX3 Byte1 */ -#define X86_VEX2_M 1 /* VEX2.M always 1 */ -#define X86_VEX_V(vex) (((vex) & 0x78) >> 3) /* VEX3 Byte2, VEX2 Byte1 */ -#define X86_VEX_P(vex) ((vex) & 0x03) /* VEX3 Byte2, VEX2 Byte1 */ -#define X86_VEX_M_MAX 0x1f /* VEX3.M Maximum value */ - -extern void insn_init(struct insn *insn, const void *kaddr, int buf_len, int x86_64); -extern void insn_get_prefixes(struct insn *insn); -extern void insn_get_opcode(struct insn *insn); -extern void insn_get_modrm(struct insn *insn); -extern void insn_get_sib(struct insn *insn); -extern void insn_get_displacement(struct insn *insn); -extern void insn_get_immediate(struct insn *insn); -extern void insn_get_length(struct insn *insn); - -/* Attribute will be determined after getting ModRM (for opcode groups) */ -static inline void insn_get_attribute(struct insn *insn) -{ - insn_get_modrm(insn); -} - -/* Instruction uses RIP-relative addressing */ -extern int insn_rip_relative(struct insn *insn); - -/* Init insn for kernel text */ -static inline void kernel_insn_init(struct insn *insn, - const void *kaddr, int buf_len) -{ -#ifdef CONFIG_X86_64 - insn_init(insn, kaddr, buf_len, 1); -#else /* CONFIG_X86_32 */ - insn_init(insn, kaddr, buf_len, 0); -#endif -} - -static inline int insn_is_avx(struct insn *insn) -{ - if (!insn->prefixes.got) - insn_get_prefixes(insn); - return (insn->vex_prefix.value != 0); -} - -static inline int insn_is_evex(struct insn *insn) -{ - if (!insn->prefixes.got) - insn_get_prefixes(insn); - return (insn->vex_prefix.nbytes == 4); -} - -/* Ensure this instruction is decoded completely */ -static inline int insn_complete(struct insn *insn) -{ - return insn->opcode.got && insn->modrm.got && insn->sib.got && - insn->displacement.got && insn->immediate.got; -} - -static inline insn_byte_t insn_vex_m_bits(struct insn *insn) -{ - if (insn->vex_prefix.nbytes == 2) /* 2 bytes VEX */ - return X86_VEX2_M; - else if (insn->vex_prefix.nbytes == 3) /* 3 bytes VEX */ - return X86_VEX3_M(insn->vex_prefix.bytes[1]); - else /* EVEX */ - return X86_EVEX_M(insn->vex_prefix.bytes[1]); -} - -static inline insn_byte_t insn_vex_p_bits(struct insn *insn) -{ - if (insn->vex_prefix.nbytes == 2) /* 2 bytes VEX */ - return X86_VEX_P(insn->vex_prefix.bytes[1]); - else - return X86_VEX_P(insn->vex_prefix.bytes[2]); -} - -/* Get the last prefix id from last prefix or VEX prefix */ -static inline int insn_last_prefix_id(struct insn *insn) -{ - if (insn_is_avx(insn)) - return insn_vex_p_bits(insn); /* VEX_p is a SIMD prefix id */ - - if (insn->prefixes.bytes[3]) - return inat_get_last_prefix_id(insn->prefixes.bytes[3]); - - return 0; -} - -/* Offset of each field from kaddr */ -static inline int insn_offset_rex_prefix(struct insn *insn) -{ - return insn->prefixes.nbytes; -} -static inline int insn_offset_vex_prefix(struct insn *insn) -{ - return insn_offset_rex_prefix(insn) + insn->rex_prefix.nbytes; -} -static inline int insn_offset_opcode(struct insn *insn) -{ - return insn_offset_vex_prefix(insn) + insn->vex_prefix.nbytes; -} -static inline int insn_offset_modrm(struct insn *insn) -{ - return insn_offset_opcode(insn) + insn->opcode.nbytes; -} -static inline int insn_offset_sib(struct insn *insn) -{ - return insn_offset_modrm(insn) + insn->modrm.nbytes; -} -static inline int insn_offset_displacement(struct insn *insn) -{ - return insn_offset_sib(insn) + insn->sib.nbytes; -} -static inline int insn_offset_immediate(struct insn *insn) -{ - return insn_offset_displacement(insn) + insn->displacement.nbytes; -} - -#endif /* _ASM_X86_INSN_H */ diff --git a/tools/objtool/arch/x86/insn/x86-opcode-map.txt b/tools/objtool/arch/x86/insn/x86-opcode-map.txt deleted file mode 100644 index 12e377184ee4..000000000000 --- a/tools/objtool/arch/x86/insn/x86-opcode-map.txt +++ /dev/null @@ -1,1063 +0,0 @@ -# x86 Opcode Maps -# -# This is (mostly) based on following documentations. -# - Intel(R) 64 and IA-32 Architectures Software Developer's Manual Vol.2C -# (#326018-047US, June 2013) -# -#<Opcode maps> -# Table: table-name -# Referrer: escaped-name -# AVXcode: avx-code -# opcode: mnemonic|GrpXXX [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...] -# (or) -# opcode: escape # escaped-name -# EndTable -# -# mnemonics that begin with lowercase 'v' accept a VEX or EVEX prefix -# mnemonics that begin with lowercase 'k' accept a VEX prefix -# -#<group maps> -# GrpTable: GrpXXX -# reg: mnemonic [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...] -# EndTable -# -# AVX Superscripts -# (ev): this opcode requires EVEX prefix. -# (evo): this opcode is changed by EVEX prefix (EVEX opcode) -# (v): this opcode requires VEX prefix. -# (v1): this opcode only supports 128bit VEX. -# -# Last Prefix Superscripts -# - (66): the last prefix is 0x66 -# - (F3): the last prefix is 0xF3 -# - (F2): the last prefix is 0xF2 -# - (!F3) : the last prefix is not 0xF3 (including non-last prefix case) -# - (66&F2): Both 0x66 and 0xF2 prefixes are specified. - -Table: one byte opcode -Referrer: -AVXcode: -# 0x00 - 0x0f -00: ADD Eb,Gb -01: ADD Ev,Gv -02: ADD Gb,Eb -03: ADD Gv,Ev -04: ADD AL,Ib -05: ADD rAX,Iz -06: PUSH ES (i64) -07: POP ES (i64) -08: OR Eb,Gb -09: OR Ev,Gv -0a: OR Gb,Eb -0b: OR Gv,Ev -0c: OR AL,Ib -0d: OR rAX,Iz -0e: PUSH CS (i64) -0f: escape # 2-byte escape -# 0x10 - 0x1f -10: ADC Eb,Gb -11: ADC Ev,Gv -12: ADC Gb,Eb -13: ADC Gv,Ev -14: ADC AL,Ib -15: ADC rAX,Iz -16: PUSH SS (i64) -17: POP SS (i64) -18: SBB Eb,Gb -19: SBB Ev,Gv -1a: SBB Gb,Eb -1b: SBB Gv,Ev -1c: SBB AL,Ib -1d: SBB rAX,Iz -1e: PUSH DS (i64) -1f: POP DS (i64) -# 0x20 - 0x2f -20: AND Eb,Gb -21: AND Ev,Gv -22: AND Gb,Eb -23: AND Gv,Ev -24: AND AL,Ib -25: AND rAx,Iz -26: SEG=ES (Prefix) -27: DAA (i64) -28: SUB Eb,Gb -29: SUB Ev,Gv -2a: SUB Gb,Eb -2b: SUB Gv,Ev -2c: SUB AL,Ib -2d: SUB rAX,Iz -2e: SEG=CS (Prefix) -2f: DAS (i64) -# 0x30 - 0x3f -30: XOR Eb,Gb -31: XOR Ev,Gv -32: XOR Gb,Eb -33: XOR Gv,Ev -34: XOR AL,Ib -35: XOR rAX,Iz -36: SEG=SS (Prefix) -37: AAA (i64) -38: CMP Eb,Gb -39: CMP Ev,Gv -3a: CMP Gb,Eb -3b: CMP Gv,Ev -3c: CMP AL,Ib -3d: CMP rAX,Iz -3e: SEG=DS (Prefix) -3f: AAS (i64) -# 0x40 - 0x4f -40: INC eAX (i64) | REX (o64) -41: INC eCX (i64) | REX.B (o64) -42: INC eDX (i64) | REX.X (o64) -43: INC eBX (i64) | REX.XB (o64) -44: INC eSP (i64) | REX.R (o64) -45: INC eBP (i64) | REX.RB (o64) -46: INC eSI (i64) | REX.RX (o64) -47: INC eDI (i64) | REX.RXB (o64) -48: DEC eAX (i64) | REX.W (o64) -49: DEC eCX (i64) | REX.WB (o64) -4a: DEC eDX (i64) | REX.WX (o64) -4b: DEC eBX (i64) | REX.WXB (o64) -4c: DEC eSP (i64) | REX.WR (o64) -4d: DEC eBP (i64) | REX.WRB (o64) -4e: DEC eSI (i64) | REX.WRX (o64) -4f: DEC eDI (i64) | REX.WRXB (o64) -# 0x50 - 0x5f -50: PUSH rAX/r8 (d64) -51: PUSH rCX/r9 (d64) -52: PUSH rDX/r10 (d64) -53: PUSH rBX/r11 (d64) -54: PUSH rSP/r12 (d64) -55: PUSH rBP/r13 (d64) -56: PUSH rSI/r14 (d64) -57: PUSH rDI/r15 (d64) -58: POP rAX/r8 (d64) -59: POP rCX/r9 (d64) -5a: POP rDX/r10 (d64) -5b: POP rBX/r11 (d64) -5c: POP rSP/r12 (d64) -5d: POP rBP/r13 (d64) -5e: POP rSI/r14 (d64) -5f: POP rDI/r15 (d64) -# 0x60 - 0x6f -60: PUSHA/PUSHAD (i64) -61: POPA/POPAD (i64) -62: BOUND Gv,Ma (i64) | EVEX (Prefix) -63: ARPL Ew,Gw (i64) | MOVSXD Gv,Ev (o64) -64: SEG=FS (Prefix) -65: SEG=GS (Prefix) -66: Operand-Size (Prefix) -67: Address-Size (Prefix) -68: PUSH Iz (d64) -69: IMUL Gv,Ev,Iz -6a: PUSH Ib (d64) -6b: IMUL Gv,Ev,Ib -6c: INS/INSB Yb,DX -6d: INS/INSW/INSD Yz,DX -6e: OUTS/OUTSB DX,Xb -6f: OUTS/OUTSW/OUTSD DX,Xz -# 0x70 - 0x7f -70: JO Jb -71: JNO Jb -72: JB/JNAE/JC Jb -73: JNB/JAE/JNC Jb -74: JZ/JE Jb -75: JNZ/JNE Jb -76: JBE/JNA Jb -77: JNBE/JA Jb -78: JS Jb -79: JNS Jb -7a: JP/JPE Jb -7b: JNP/JPO Jb -7c: JL/JNGE Jb -7d: JNL/JGE Jb -7e: JLE/JNG Jb -7f: JNLE/JG Jb -# 0x80 - 0x8f -80: Grp1 Eb,Ib (1A) -81: Grp1 Ev,Iz (1A) -82: Grp1 Eb,Ib (1A),(i64) -83: Grp1 Ev,Ib (1A) -84: TEST Eb,Gb -85: TEST Ev,Gv -86: XCHG Eb,Gb -87: XCHG Ev,Gv -88: MOV Eb,Gb -89: MOV Ev,Gv -8a: MOV Gb,Eb -8b: MOV Gv,Ev -8c: MOV Ev,Sw -8d: LEA Gv,M -8e: MOV Sw,Ew -8f: Grp1A (1A) | POP Ev (d64) -# 0x90 - 0x9f -90: NOP | PAUSE (F3) | XCHG r8,rAX -91: XCHG rCX/r9,rAX -92: XCHG rDX/r10,rAX -93: XCHG rBX/r11,rAX -94: XCHG rSP/r12,rAX -95: XCHG rBP/r13,rAX -96: XCHG rSI/r14,rAX -97: XCHG rDI/r15,rAX -98: CBW/CWDE/CDQE -99: CWD/CDQ/CQO -9a: CALLF Ap (i64) -9b: FWAIT/WAIT -9c: PUSHF/D/Q Fv (d64) -9d: POPF/D/Q Fv (d64) -9e: SAHF -9f: LAHF -# 0xa0 - 0xaf -a0: MOV AL,Ob -a1: MOV rAX,Ov -a2: MOV Ob,AL -a3: MOV Ov,rAX -a4: MOVS/B Yb,Xb -a5: MOVS/W/D/Q Yv,Xv -a6: CMPS/B Xb,Yb -a7: CMPS/W/D Xv,Yv -a8: TEST AL,Ib -a9: TEST rAX,Iz -aa: STOS/B Yb,AL -ab: STOS/W/D/Q Yv,rAX -ac: LODS/B AL,Xb -ad: LODS/W/D/Q rAX,Xv -ae: SCAS/B AL,Yb -# Note: The May 2011 Intel manual shows Xv for the second parameter of the -# next instruction but Yv is correct -af: SCAS/W/D/Q rAX,Yv -# 0xb0 - 0xbf -b0: MOV AL/R8L,Ib -b1: MOV CL/R9L,Ib -b2: MOV DL/R10L,Ib -b3: MOV BL/R11L,Ib -b4: MOV AH/R12L,Ib -b5: MOV CH/R13L,Ib -b6: MOV DH/R14L,Ib -b7: MOV BH/R15L,Ib -b8: MOV rAX/r8,Iv -b9: MOV rCX/r9,Iv -ba: MOV rDX/r10,Iv -bb: MOV rBX/r11,Iv -bc: MOV rSP/r12,Iv -bd: MOV rBP/r13,Iv -be: MOV rSI/r14,Iv -bf: MOV rDI/r15,Iv -# 0xc0 - 0xcf -c0: Grp2 Eb,Ib (1A) -c1: Grp2 Ev,Ib (1A) -c2: RETN Iw (f64) -c3: RETN -c4: LES Gz,Mp (i64) | VEX+2byte (Prefix) -c5: LDS Gz,Mp (i64) | VEX+1byte (Prefix) -c6: Grp11A Eb,Ib (1A) -c7: Grp11B Ev,Iz (1A) -c8: ENTER Iw,Ib -c9: LEAVE (d64) -ca: RETF Iw -cb: RETF -cc: INT3 -cd: INT Ib -ce: INTO (i64) -cf: IRET/D/Q -# 0xd0 - 0xdf -d0: Grp2 Eb,1 (1A) -d1: Grp2 Ev,1 (1A) -d2: Grp2 Eb,CL (1A) -d3: Grp2 Ev,CL (1A) -d4: AAM Ib (i64) -d5: AAD Ib (i64) -d6: -d7: XLAT/XLATB -d8: ESC -d9: ESC -da: ESC -db: ESC -dc: ESC -dd: ESC -de: ESC -df: ESC -# 0xe0 - 0xef -# Note: "forced64" is Intel CPU behavior: they ignore 0x66 prefix -# in 64-bit mode. AMD CPUs accept 0x66 prefix, it causes RIP truncation -# to 16 bits. In 32-bit mode, 0x66 is accepted by both Intel and AMD. -e0: LOOPNE/LOOPNZ Jb (f64) -e1: LOOPE/LOOPZ Jb (f64) -e2: LOOP Jb (f64) -e3: JrCXZ Jb (f64) -e4: IN AL,Ib -e5: IN eAX,Ib -e6: OUT Ib,AL -e7: OUT Ib,eAX -# With 0x66 prefix in 64-bit mode, for AMD CPUs immediate offset -# in "near" jumps and calls is 16-bit. For CALL, -# push of return address is 16-bit wide, RSP is decremented by 2 -# but is not truncated to 16 bits, unlike RIP. -e8: CALL Jz (f64) -e9: JMP-near Jz (f64) -ea: JMP-far Ap (i64) -eb: JMP-short Jb (f64) -ec: IN AL,DX -ed: IN eAX,DX -ee: OUT DX,AL -ef: OUT DX,eAX -# 0xf0 - 0xff -f0: LOCK (Prefix) -f1: -f2: REPNE (Prefix) | XACQUIRE (Prefix) -f3: REP/REPE (Prefix) | XRELEASE (Prefix) -f4: HLT -f5: CMC -f6: Grp3_1 Eb (1A) -f7: Grp3_2 Ev (1A) -f8: CLC -f9: STC -fa: CLI -fb: STI -fc: CLD -fd: STD -fe: Grp4 (1A) -ff: Grp5 (1A) -EndTable - -Table: 2-byte opcode (0x0f) -Referrer: 2-byte escape -AVXcode: 1 -# 0x0f 0x00-0x0f -00: Grp6 (1A) -01: Grp7 (1A) -02: LAR Gv,Ew -03: LSL Gv,Ew -04: -05: SYSCALL (o64) -06: CLTS -07: SYSRET (o64) -08: INVD -09: WBINVD -0a: -0b: UD2 (1B) -0c: -# AMD's prefetch group. Intel supports prefetchw(/1) only. -0d: GrpP -0e: FEMMS -# 3DNow! uses the last imm byte as opcode extension. -0f: 3DNow! Pq,Qq,Ib -# 0x0f 0x10-0x1f -# NOTE: According to Intel SDM opcode map, vmovups and vmovupd has no operands -# but it actually has operands. And also, vmovss and vmovsd only accept 128bit. -# MOVSS/MOVSD has too many forms(3) on SDM. This map just shows a typical form. -# Many AVX instructions lack v1 superscript, according to Intel AVX-Prgramming -# Reference A.1 -10: vmovups Vps,Wps | vmovupd Vpd,Wpd (66) | vmovss Vx,Hx,Wss (F3),(v1) | vmovsd Vx,Hx,Wsd (F2),(v1) -11: vmovups Wps,Vps | vmovupd Wpd,Vpd (66) | vmovss Wss,Hx,Vss (F3),(v1) | vmovsd Wsd,Hx,Vsd (F2),(v1) -12: vmovlps Vq,Hq,Mq (v1) | vmovhlps Vq,Hq,Uq (v1) | vmovlpd Vq,Hq,Mq (66),(v1) | vmovsldup Vx,Wx (F3) | vmovddup Vx,Wx (F2) -13: vmovlps Mq,Vq (v1) | vmovlpd Mq,Vq (66),(v1) -14: vunpcklps Vx,Hx,Wx | vunpcklpd Vx,Hx,Wx (66) -15: vunpckhps Vx,Hx,Wx | vunpckhpd Vx,Hx,Wx (66) -16: vmovhps Vdq,Hq,Mq (v1) | vmovlhps Vdq,Hq,Uq (v1) | vmovhpd Vdq,Hq,Mq (66),(v1) | vmovshdup Vx,Wx (F3) -17: vmovhps Mq,Vq (v1) | vmovhpd Mq,Vq (66),(v1) -18: Grp16 (1A) -19: -# Intel SDM opcode map does not list MPX instructions. For now using Gv for -# bnd registers and Ev for everything else is OK because the instruction -# decoder does not use the information except as an indication that there is -# a ModR/M byte. -1a: BNDCL Gv,Ev (F3) | BNDCU Gv,Ev (F2) | BNDMOV Gv,Ev (66) | BNDLDX Gv,Ev -1b: BNDCN Gv,Ev (F2) | BNDMOV Ev,Gv (66) | BNDMK Gv,Ev (F3) | BNDSTX Ev,Gv -1c: -1d: -1e: -1f: NOP Ev -# 0x0f 0x20-0x2f -20: MOV Rd,Cd -21: MOV Rd,Dd -22: MOV Cd,Rd -23: MOV Dd,Rd -24: -25: -26: -27: -28: vmovaps Vps,Wps | vmovapd Vpd,Wpd (66) -29: vmovaps Wps,Vps | vmovapd Wpd,Vpd (66) -2a: cvtpi2ps Vps,Qpi | cvtpi2pd Vpd,Qpi (66) | vcvtsi2ss Vss,Hss,Ey (F3),(v1) | vcvtsi2sd Vsd,Hsd,Ey (F2),(v1) -2b: vmovntps Mps,Vps | vmovntpd Mpd,Vpd (66) -2c: cvttps2pi Ppi,Wps | cvttpd2pi Ppi,Wpd (66) | vcvttss2si Gy,Wss (F3),(v1) | vcvttsd2si Gy,Wsd (F2),(v1) -2d: cvtps2pi Ppi,Wps | cvtpd2pi Qpi,Wpd (66) | vcvtss2si Gy,Wss (F3),(v1) | vcvtsd2si Gy,Wsd (F2),(v1) -2e: vucomiss Vss,Wss (v1) | vucomisd Vsd,Wsd (66),(v1) -2f: vcomiss Vss,Wss (v1) | vcomisd Vsd,Wsd (66),(v1) -# 0x0f 0x30-0x3f -30: WRMSR -31: RDTSC -32: RDMSR -33: RDPMC -34: SYSENTER -35: SYSEXIT -36: -37: GETSEC -38: escape # 3-byte escape 1 -39: -3a: escape # 3-byte escape 2 -3b: -3c: -3d: -3e: -3f: -# 0x0f 0x40-0x4f -40: CMOVO Gv,Ev -41: CMOVNO Gv,Ev | kandw/q Vk,Hk,Uk | kandb/d Vk,Hk,Uk (66) -42: CMOVB/C/NAE Gv,Ev | kandnw/q Vk,Hk,Uk | kandnb/d Vk,Hk,Uk (66) -43: CMOVAE/NB/NC Gv,Ev -44: CMOVE/Z Gv,Ev | knotw/q Vk,Uk | knotb/d Vk,Uk (66) -45: CMOVNE/NZ Gv,Ev | korw/q Vk,Hk,Uk | korb/d Vk,Hk,Uk (66) -46: CMOVBE/NA Gv,Ev | kxnorw/q Vk,Hk,Uk | kxnorb/d Vk,Hk,Uk (66) -47: CMOVA/NBE Gv,Ev | kxorw/q Vk,Hk,Uk | kxorb/d Vk,Hk,Uk (66) -48: CMOVS Gv,Ev -49: CMOVNS Gv,Ev -4a: CMOVP/PE Gv,Ev | kaddw/q Vk,Hk,Uk | kaddb/d Vk,Hk,Uk (66) -4b: CMOVNP/PO Gv,Ev | kunpckbw Vk,Hk,Uk (66) | kunpckwd/dq Vk,Hk,Uk -4c: CMOVL/NGE Gv,Ev -4d: CMOVNL/GE Gv,Ev -4e: CMOVLE/NG Gv,Ev -4f: CMOVNLE/G Gv,Ev -# 0x0f 0x50-0x5f -50: vmovmskps Gy,Ups | vmovmskpd Gy,Upd (66) -51: vsqrtps Vps,Wps | vsqrtpd Vpd,Wpd (66) | vsqrtss Vss,Hss,Wss (F3),(v1) | vsqrtsd Vsd,Hsd,Wsd (F2),(v1) -52: vrsqrtps Vps,Wps | vrsqrtss Vss,Hss,Wss (F3),(v1) -53: vrcpps Vps,Wps | vrcpss Vss,Hss,Wss (F3),(v1) -54: vandps Vps,Hps,Wps | vandpd Vpd,Hpd,Wpd (66) -55: vandnps Vps,Hps,Wps | vandnpd Vpd,Hpd,Wpd (66) -56: vorps Vps,Hps,Wps | vorpd Vpd,Hpd,Wpd (66) -57: vxorps Vps,Hps,Wps | vxorpd Vpd,Hpd,Wpd (66) -58: vaddps Vps,Hps,Wps | vaddpd Vpd,Hpd,Wpd (66) | vaddss Vss,Hss,Wss (F3),(v1) | vaddsd Vsd,Hsd,Wsd (F2),(v1) -59: vmulps Vps,Hps,Wps | vmulpd Vpd,Hpd,Wpd (66) | vmulss Vss,Hss,Wss (F3),(v1) | vmulsd Vsd,Hsd,Wsd (F2),(v1) -5a: vcvtps2pd Vpd,Wps | vcvtpd2ps Vps,Wpd (66) | vcvtss2sd Vsd,Hx,Wss (F3),(v1) | vcvtsd2ss Vss,Hx,Wsd (F2),(v1) -5b: vcvtdq2ps Vps,Wdq | vcvtqq2ps Vps,Wqq (evo) | vcvtps2dq Vdq,Wps (66) | vcvttps2dq Vdq,Wps (F3) -5c: vsubps Vps,Hps,Wps | vsubpd Vpd,Hpd,Wpd (66) | vsubss Vss,Hss,Wss (F3),(v1) | vsubsd Vsd,Hsd,Wsd (F2),(v1) -5d: vminps Vps,Hps,Wps | vminpd Vpd,Hpd,Wpd (66) | vminss Vss,Hss,Wss (F3),(v1) | vminsd Vsd,Hsd,Wsd (F2),(v1) -5e: vdivps Vps,Hps,Wps | vdivpd Vpd,Hpd,Wpd (66) | vdivss Vss,Hss,Wss (F3),(v1) | vdivsd Vsd,Hsd,Wsd (F2),(v1) -5f: vmaxps Vps,Hps,Wps | vmaxpd Vpd,Hpd,Wpd (66) | vmaxss Vss,Hss,Wss (F3),(v1) | vmaxsd Vsd,Hsd,Wsd (F2),(v1) -# 0x0f 0x60-0x6f -60: punpcklbw Pq,Qd | vpunpcklbw Vx,Hx,Wx (66),(v1) -61: punpcklwd Pq,Qd | vpunpcklwd Vx,Hx,Wx (66),(v1) -62: punpckldq Pq,Qd | vpunpckldq Vx,Hx,Wx (66),(v1) -63: packsswb Pq,Qq | vpacksswb Vx,Hx,Wx (66),(v1) -64: pcmpgtb Pq,Qq | vpcmpgtb Vx,Hx,Wx (66),(v1) -65: pcmpgtw Pq,Qq | vpcmpgtw Vx,Hx,Wx (66),(v1) -66: pcmpgtd Pq,Qq | vpcmpgtd Vx,Hx,Wx (66),(v1) -67: packuswb Pq,Qq | vpackuswb Vx,Hx,Wx (66),(v1) -68: punpckhbw Pq,Qd | vpunpckhbw Vx,Hx,Wx (66),(v1) -69: punpckhwd Pq,Qd | vpunpckhwd Vx,Hx,Wx (66),(v1) -6a: punpckhdq Pq,Qd | vpunpckhdq Vx,Hx,Wx (66),(v1) -6b: packssdw Pq,Qd | vpackssdw Vx,Hx,Wx (66),(v1) -6c: vpunpcklqdq Vx,Hx,Wx (66),(v1) -6d: vpunpckhqdq Vx,Hx,Wx (66),(v1) -6e: movd/q Pd,Ey | vmovd/q Vy,Ey (66),(v1) -6f: movq Pq,Qq | vmovdqa Vx,Wx (66) | vmovdqa32/64 Vx,Wx (66),(evo) | vmovdqu Vx,Wx (F3) | vmovdqu32/64 Vx,Wx (F3),(evo) | vmovdqu8/16 Vx,Wx (F2),(ev) -# 0x0f 0x70-0x7f -70: pshufw Pq,Qq,Ib | vpshufd Vx,Wx,Ib (66),(v1) | vpshufhw Vx,Wx,Ib (F3),(v1) | vpshuflw Vx,Wx,Ib (F2),(v1) -71: Grp12 (1A) -72: Grp13 (1A) -73: Grp14 (1A) -74: pcmpeqb Pq,Qq | vpcmpeqb Vx,Hx,Wx (66),(v1) -75: pcmpeqw Pq,Qq | vpcmpeqw Vx,Hx,Wx (66),(v1) -76: pcmpeqd Pq,Qq | vpcmpeqd Vx,Hx,Wx (66),(v1) -# Note: Remove (v), because vzeroall and vzeroupper becomes emms without VEX. -77: emms | vzeroupper | vzeroall -78: VMREAD Ey,Gy | vcvttps2udq/pd2udq Vx,Wpd (evo) | vcvttsd2usi Gv,Wx (F2),(ev) | vcvttss2usi Gv,Wx (F3),(ev) | vcvttps2uqq/pd2uqq Vx,Wx (66),(ev) -79: VMWRITE Gy,Ey | vcvtps2udq/pd2udq Vx,Wpd (evo) | vcvtsd2usi Gv,Wx (F2),(ev) | vcvtss2usi Gv,Wx (F3),(ev) | vcvtps2uqq/pd2uqq Vx,Wx (66),(ev) -7a: vcvtudq2pd/uqq2pd Vpd,Wx (F3),(ev) | vcvtudq2ps/uqq2ps Vpd,Wx (F2),(ev) | vcvttps2qq/pd2qq Vx,Wx (66),(ev) -7b: vcvtusi2sd Vpd,Hpd,Ev (F2),(ev) | vcvtusi2ss Vps,Hps,Ev (F3),(ev) | vcvtps2qq/pd2qq Vx,Wx (66),(ev) -7c: vhaddpd Vpd,Hpd,Wpd (66) | vhaddps Vps,Hps,Wps (F2) -7d: vhsubpd Vpd,Hpd,Wpd (66) | vhsubps Vps,Hps,Wps (F2) -7e: movd/q Ey,Pd | vmovd/q Ey,Vy (66),(v1) | vmovq Vq,Wq (F3),(v1) -7f: movq Qq,Pq | vmovdqa Wx,Vx (66) | vmovdqa32/64 Wx,Vx (66),(evo) | vmovdqu Wx,Vx (F3) | vmovdqu32/64 Wx,Vx (F3),(evo) | vmovdqu8/16 Wx,Vx (F2),(ev) -# 0x0f 0x80-0x8f -# Note: "forced64" is Intel CPU behavior (see comment about CALL insn). -80: JO Jz (f64) -81: JNO Jz (f64) -82: JB/JC/JNAE Jz (f64) -83: JAE/JNB/JNC Jz (f64) -84: JE/JZ Jz (f64) -85: JNE/JNZ Jz (f64) -86: JBE/JNA Jz (f64) -87: JA/JNBE Jz (f64) -88: JS Jz (f64) -89: JNS Jz (f64) -8a: JP/JPE Jz (f64) -8b: JNP/JPO Jz (f64) -8c: JL/JNGE Jz (f64) -8d: JNL/JGE Jz (f64) -8e: JLE/JNG Jz (f64) -8f: JNLE/JG Jz (f64) -# 0x0f 0x90-0x9f -90: SETO Eb | kmovw/q Vk,Wk | kmovb/d Vk,Wk (66) -91: SETNO Eb | kmovw/q Mv,Vk | kmovb/d Mv,Vk (66) -92: SETB/C/NAE Eb | kmovw Vk,Rv | kmovb Vk,Rv (66) | kmovq/d Vk,Rv (F2) -93: SETAE/NB/NC Eb | kmovw Gv,Uk | kmovb Gv,Uk (66) | kmovq/d Gv,Uk (F2) -94: SETE/Z Eb -95: SETNE/NZ Eb -96: SETBE/NA Eb -97: SETA/NBE Eb -98: SETS Eb | kortestw/q Vk,Uk | kortestb/d Vk,Uk (66) -99: SETNS Eb | ktestw/q Vk,Uk | ktestb/d Vk,Uk (66) -9a: SETP/PE Eb -9b: SETNP/PO Eb -9c: SETL/NGE Eb -9d: SETNL/GE Eb -9e: SETLE/NG Eb -9f: SETNLE/G Eb -# 0x0f 0xa0-0xaf -a0: PUSH FS (d64) -a1: POP FS (d64) -a2: CPUID -a3: BT Ev,Gv -a4: SHLD Ev,Gv,Ib -a5: SHLD Ev,Gv,CL -a6: GrpPDLK -a7: GrpRNG -a8: PUSH GS (d64) -a9: POP GS (d64) -aa: RSM -ab: BTS Ev,Gv -ac: SHRD Ev,Gv,Ib -ad: SHRD Ev,Gv,CL -ae: Grp15 (1A),(1C) -af: IMUL Gv,Ev -# 0x0f 0xb0-0xbf -b0: CMPXCHG Eb,Gb -b1: CMPXCHG Ev,Gv -b2: LSS Gv,Mp -b3: BTR Ev,Gv -b4: LFS Gv,Mp -b5: LGS Gv,Mp -b6: MOVZX Gv,Eb -b7: MOVZX Gv,Ew -b8: JMPE (!F3) | POPCNT Gv,Ev (F3) -b9: Grp10 (1A) -ba: Grp8 Ev,Ib (1A) -bb: BTC Ev,Gv -bc: BSF Gv,Ev (!F3) | TZCNT Gv,Ev (F3) -bd: BSR Gv,Ev (!F3) | LZCNT Gv,Ev (F3) -be: MOVSX Gv,Eb -bf: MOVSX Gv,Ew -# 0x0f 0xc0-0xcf -c0: XADD Eb,Gb -c1: XADD Ev,Gv -c2: vcmpps Vps,Hps,Wps,Ib | vcmppd Vpd,Hpd,Wpd,Ib (66) | vcmpss Vss,Hss,Wss,Ib (F3),(v1) | vcmpsd Vsd,Hsd,Wsd,Ib (F2),(v1) -c3: movnti My,Gy -c4: pinsrw Pq,Ry/Mw,Ib | vpinsrw Vdq,Hdq,Ry/Mw,Ib (66),(v1) -c5: pextrw Gd,Nq,Ib | vpextrw Gd,Udq,Ib (66),(v1) -c6: vshufps Vps,Hps,Wps,Ib | vshufpd Vpd,Hpd,Wpd,Ib (66) -c7: Grp9 (1A) -c8: BSWAP RAX/EAX/R8/R8D -c9: BSWAP RCX/ECX/R9/R9D -ca: BSWAP RDX/EDX/R10/R10D -cb: BSWAP RBX/EBX/R11/R11D -cc: BSWAP RSP/ESP/R12/R12D -cd: BSWAP RBP/EBP/R13/R13D -ce: BSWAP RSI/ESI/R14/R14D -cf: BSWAP RDI/EDI/R15/R15D -# 0x0f 0xd0-0xdf -d0: vaddsubpd Vpd,Hpd,Wpd (66) | vaddsubps Vps,Hps,Wps (F2) -d1: psrlw Pq,Qq | vpsrlw Vx,Hx,Wx (66),(v1) -d2: psrld Pq,Qq | vpsrld Vx,Hx,Wx (66),(v1) -d3: psrlq Pq,Qq | vpsrlq Vx,Hx,Wx (66),(v1) -d4: paddq Pq,Qq | vpaddq Vx,Hx,Wx (66),(v1) -d5: pmullw Pq,Qq | vpmullw Vx,Hx,Wx (66),(v1) -d6: vmovq Wq,Vq (66),(v1) | movq2dq Vdq,Nq (F3) | movdq2q Pq,Uq (F2) -d7: pmovmskb Gd,Nq | vpmovmskb Gd,Ux (66),(v1) -d8: psubusb Pq,Qq | vpsubusb Vx,Hx,Wx (66),(v1) -d9: psubusw Pq,Qq | vpsubusw Vx,Hx,Wx (66),(v1) -da: pminub Pq,Qq | vpminub Vx,Hx,Wx (66),(v1) -db: pand Pq,Qq | vpand Vx,Hx,Wx (66),(v1) | vpandd/q Vx,Hx,Wx (66),(evo) -dc: paddusb Pq,Qq | vpaddusb Vx,Hx,Wx (66),(v1) -dd: paddusw Pq,Qq | vpaddusw Vx,Hx,Wx (66),(v1) -de: pmaxub Pq,Qq | vpmaxub Vx,Hx,Wx (66),(v1) -df: pandn Pq,Qq | vpandn Vx,Hx,Wx (66),(v1) | vpandnd/q Vx,Hx,Wx (66),(evo) -# 0x0f 0xe0-0xef -e0: pavgb Pq,Qq | vpavgb Vx,Hx,Wx (66),(v1) -e1: psraw Pq,Qq | vpsraw Vx,Hx,Wx (66),(v1) -e2: psrad Pq,Qq | vpsrad Vx,Hx,Wx (66),(v1) -e3: pavgw Pq,Qq | vpavgw Vx,Hx,Wx (66),(v1) -e4: pmulhuw Pq,Qq | vpmulhuw Vx,Hx,Wx (66),(v1) -e5: pmulhw Pq,Qq | vpmulhw Vx,Hx,Wx (66),(v1) -e6: vcvttpd2dq Vx,Wpd (66) | vcvtdq2pd Vx,Wdq (F3) | vcvtdq2pd/qq2pd Vx,Wdq (F3),(evo) | vcvtpd2dq Vx,Wpd (F2) -e7: movntq Mq,Pq | vmovntdq Mx,Vx (66) -e8: psubsb Pq,Qq | vpsubsb Vx,Hx,Wx (66),(v1) -e9: psubsw Pq,Qq | vpsubsw Vx,Hx,Wx (66),(v1) -ea: pminsw Pq,Qq | vpminsw Vx,Hx,Wx (66),(v1) -eb: por Pq,Qq | vpor Vx,Hx,Wx (66),(v1) | vpord/q Vx,Hx,Wx (66),(evo) -ec: paddsb Pq,Qq | vpaddsb Vx,Hx,Wx (66),(v1) -ed: paddsw Pq,Qq | vpaddsw Vx,Hx,Wx (66),(v1) -ee: pmaxsw Pq,Qq | vpmaxsw Vx,Hx,Wx (66),(v1) -ef: pxor Pq,Qq | vpxor Vx,Hx,Wx (66),(v1) | vpxord/q Vx,Hx,Wx (66),(evo) -# 0x0f 0xf0-0xff -f0: vlddqu Vx,Mx (F2) -f1: psllw Pq,Qq | vpsllw Vx,Hx,Wx (66),(v1) -f2: pslld Pq,Qq | vpslld Vx,Hx,Wx (66),(v1) -f3: psllq Pq,Qq | vpsllq Vx,Hx,Wx (66),(v1) -f4: pmuludq Pq,Qq | vpmuludq Vx,Hx,Wx (66),(v1) -f5: pmaddwd Pq,Qq | vpmaddwd Vx,Hx,Wx (66),(v1) -f6: psadbw Pq,Qq | vpsadbw Vx,Hx,Wx (66),(v1) -f7: maskmovq Pq,Nq | vmaskmovdqu Vx,Ux (66),(v1) -f8: psubb Pq,Qq | vpsubb Vx,Hx,Wx (66),(v1) -f9: psubw Pq,Qq | vpsubw Vx,Hx,Wx (66),(v1) -fa: psubd Pq,Qq | vpsubd Vx,Hx,Wx (66),(v1) -fb: psubq Pq,Qq | vpsubq Vx,Hx,Wx (66),(v1) -fc: paddb Pq,Qq | vpaddb Vx,Hx,Wx (66),(v1) -fd: paddw Pq,Qq | vpaddw Vx,Hx,Wx (66),(v1) -fe: paddd Pq,Qq | vpaddd Vx,Hx,Wx (66),(v1) -ff: -EndTable - -Table: 3-byte opcode 1 (0x0f 0x38) -Referrer: 3-byte escape 1 -AVXcode: 2 -# 0x0f 0x38 0x00-0x0f -00: pshufb Pq,Qq | vpshufb Vx,Hx,Wx (66),(v1) -01: phaddw Pq,Qq | vphaddw Vx,Hx,Wx (66),(v1) -02: phaddd Pq,Qq | vphaddd Vx,Hx,Wx (66),(v1) -03: phaddsw Pq,Qq | vphaddsw Vx,Hx,Wx (66),(v1) -04: pmaddubsw Pq,Qq | vpmaddubsw Vx,Hx,Wx (66),(v1) -05: phsubw Pq,Qq | vphsubw Vx,Hx,Wx (66),(v1) -06: phsubd Pq,Qq | vphsubd Vx,Hx,Wx (66),(v1) -07: phsubsw Pq,Qq | vphsubsw Vx,Hx,Wx (66),(v1) -08: psignb Pq,Qq | vpsignb Vx,Hx,Wx (66),(v1) -09: psignw Pq,Qq | vpsignw Vx,Hx,Wx (66),(v1) -0a: psignd Pq,Qq | vpsignd Vx,Hx,Wx (66),(v1) -0b: pmulhrsw Pq,Qq | vpmulhrsw Vx,Hx,Wx (66),(v1) -0c: vpermilps Vx,Hx,Wx (66),(v) -0d: vpermilpd Vx,Hx,Wx (66),(v) -0e: vtestps Vx,Wx (66),(v) -0f: vtestpd Vx,Wx (66),(v) -# 0x0f 0x38 0x10-0x1f -10: pblendvb Vdq,Wdq (66) | vpsrlvw Vx,Hx,Wx (66),(evo) | vpmovuswb Wx,Vx (F3),(ev) -11: vpmovusdb Wx,Vd (F3),(ev) | vpsravw Vx,Hx,Wx (66),(ev) -12: vpmovusqb Wx,Vq (F3),(ev) | vpsllvw Vx,Hx,Wx (66),(ev) -13: vcvtph2ps Vx,Wx (66),(v) | vpmovusdw Wx,Vd (F3),(ev) -14: blendvps Vdq,Wdq (66) | vpmovusqw Wx,Vq (F3),(ev) | vprorvd/q Vx,Hx,Wx (66),(evo) -15: blendvpd Vdq,Wdq (66) | vpmovusqd Wx,Vq (F3),(ev) | vprolvd/q Vx,Hx,Wx (66),(evo) -16: vpermps Vqq,Hqq,Wqq (66),(v) | vpermps/d Vqq,Hqq,Wqq (66),(evo) -17: vptest Vx,Wx (66) -18: vbroadcastss Vx,Wd (66),(v) -19: vbroadcastsd Vqq,Wq (66),(v) | vbroadcastf32x2 Vqq,Wq (66),(evo) -1a: vbroadcastf128 Vqq,Mdq (66),(v) | vbroadcastf32x4/64x2 Vqq,Wq (66),(evo) -1b: vbroadcastf32x8/64x4 Vqq,Mdq (66),(ev) -1c: pabsb Pq,Qq | vpabsb Vx,Wx (66),(v1) -1d: pabsw Pq,Qq | vpabsw Vx,Wx (66),(v1) -1e: pabsd Pq,Qq | vpabsd Vx,Wx (66),(v1) -1f: vpabsq Vx,Wx (66),(ev) -# 0x0f 0x38 0x20-0x2f -20: vpmovsxbw Vx,Ux/Mq (66),(v1) | vpmovswb Wx,Vx (F3),(ev) -21: vpmovsxbd Vx,Ux/Md (66),(v1) | vpmovsdb Wx,Vd (F3),(ev) -22: vpmovsxbq Vx,Ux/Mw (66),(v1) | vpmovsqb Wx,Vq (F3),(ev) -23: vpmovsxwd Vx,Ux/Mq (66),(v1) | vpmovsdw Wx,Vd (F3),(ev) -24: vpmovsxwq Vx,Ux/Md (66),(v1) | vpmovsqw Wx,Vq (F3),(ev) -25: vpmovsxdq Vx,Ux/Mq (66),(v1) | vpmovsqd Wx,Vq (F3),(ev) -26: vptestmb/w Vk,Hx,Wx (66),(ev) | vptestnmb/w Vk,Hx,Wx (F3),(ev) -27: vptestmd/q Vk,Hx,Wx (66),(ev) | vptestnmd/q Vk,Hx,Wx (F3),(ev) -28: vpmuldq Vx,Hx,Wx (66),(v1) | vpmovm2b/w Vx,Uk (F3),(ev) -29: vpcmpeqq Vx,Hx,Wx (66),(v1) | vpmovb2m/w2m Vk,Ux (F3),(ev) -2a: vmovntdqa Vx,Mx (66),(v1) | vpbroadcastmb2q Vx,Uk (F3),(ev) -2b: vpackusdw Vx,Hx,Wx (66),(v1) -2c: vmaskmovps Vx,Hx,Mx (66),(v) | vscalefps/d Vx,Hx,Wx (66),(evo) -2d: vmaskmovpd Vx,Hx,Mx (66),(v) | vscalefss/d Vx,Hx,Wx (66),(evo) -2e: vmaskmovps Mx,Hx,Vx (66),(v) -2f: vmaskmovpd Mx,Hx,Vx (66),(v) -# 0x0f 0x38 0x30-0x3f -30: vpmovzxbw Vx,Ux/Mq (66),(v1) | vpmovwb Wx,Vx (F3),(ev) -31: vpmovzxbd Vx,Ux/Md (66),(v1) | vpmovdb Wx,Vd (F3),(ev) -32: vpmovzxbq Vx,Ux/Mw (66),(v1) | vpmovqb Wx,Vq (F3),(ev) -33: vpmovzxwd Vx,Ux/Mq (66),(v1) | vpmovdw Wx,Vd (F3),(ev) -34: vpmovzxwq Vx,Ux/Md (66),(v1) | vpmovqw Wx,Vq (F3),(ev) -35: vpmovzxdq Vx,Ux/Mq (66),(v1) | vpmovqd Wx,Vq (F3),(ev) -36: vpermd Vqq,Hqq,Wqq (66),(v) | vpermd/q Vqq,Hqq,Wqq (66),(evo) -37: vpcmpgtq Vx,Hx,Wx (66),(v1) -38: vpminsb Vx,Hx,Wx (66),(v1) | vpmovm2d/q Vx,Uk (F3),(ev) -39: vpminsd Vx,Hx,Wx (66),(v1) | vpminsd/q Vx,Hx,Wx (66),(evo) | vpmovd2m/q2m Vk,Ux (F3),(ev) -3a: vpminuw Vx,Hx,Wx (66),(v1) | vpbroadcastmw2d Vx,Uk (F3),(ev) -3b: vpminud Vx,Hx,Wx (66),(v1) | vpminud/q Vx,Hx,Wx (66),(evo) -3c: vpmaxsb Vx,Hx,Wx (66),(v1) -3d: vpmaxsd Vx,Hx,Wx (66),(v1) | vpmaxsd/q Vx,Hx,Wx (66),(evo) -3e: vpmaxuw Vx,Hx,Wx (66),(v1) -3f: vpmaxud Vx,Hx,Wx (66),(v1) | vpmaxud/q Vx,Hx,Wx (66),(evo) -# 0x0f 0x38 0x40-0x8f -40: vpmulld Vx,Hx,Wx (66),(v1) | vpmulld/q Vx,Hx,Wx (66),(evo) -41: vphminposuw Vdq,Wdq (66),(v1) -42: vgetexpps/d Vx,Wx (66),(ev) -43: vgetexpss/d Vx,Hx,Wx (66),(ev) -44: vplzcntd/q Vx,Wx (66),(ev) -45: vpsrlvd/q Vx,Hx,Wx (66),(v) -46: vpsravd Vx,Hx,Wx (66),(v) | vpsravd/q Vx,Hx,Wx (66),(evo) -47: vpsllvd/q Vx,Hx,Wx (66),(v) -# Skip 0x48-0x4b -4c: vrcp14ps/d Vpd,Wpd (66),(ev) -4d: vrcp14ss/d Vsd,Hpd,Wsd (66),(ev) -4e: vrsqrt14ps/d Vpd,Wpd (66),(ev) -4f: vrsqrt14ss/d Vsd,Hsd,Wsd (66),(ev) -# Skip 0x50-0x57 -58: vpbroadcastd Vx,Wx (66),(v) -59: vpbroadcastq Vx,Wx (66),(v) | vbroadcasti32x2 Vx,Wx (66),(evo) -5a: vbroadcasti128 Vqq,Mdq (66),(v) | vbroadcasti32x4/64x2 Vx,Wx (66),(evo) -5b: vbroadcasti32x8/64x4 Vqq,Mdq (66),(ev) -# Skip 0x5c-0x63 -64: vpblendmd/q Vx,Hx,Wx (66),(ev) -65: vblendmps/d Vx,Hx,Wx (66),(ev) -66: vpblendmb/w Vx,Hx,Wx (66),(ev) -# Skip 0x67-0x74 -75: vpermi2b/w Vx,Hx,Wx (66),(ev) -76: vpermi2d/q Vx,Hx,Wx (66),(ev) -77: vpermi2ps/d Vx,Hx,Wx (66),(ev) -78: vpbroadcastb Vx,Wx (66),(v) -79: vpbroadcastw Vx,Wx (66),(v) -7a: vpbroadcastb Vx,Rv (66),(ev) -7b: vpbroadcastw Vx,Rv (66),(ev) -7c: vpbroadcastd/q Vx,Rv (66),(ev) -7d: vpermt2b/w Vx,Hx,Wx (66),(ev) -7e: vpermt2d/q Vx,Hx,Wx (66),(ev) -7f: vpermt2ps/d Vx,Hx,Wx (66),(ev) -80: INVEPT Gy,Mdq (66) -81: INVPID Gy,Mdq (66) -82: INVPCID Gy,Mdq (66) -83: vpmultishiftqb Vx,Hx,Wx (66),(ev) -88: vexpandps/d Vpd,Wpd (66),(ev) -89: vpexpandd/q Vx,Wx (66),(ev) -8a: vcompressps/d Wx,Vx (66),(ev) -8b: vpcompressd/q Wx,Vx (66),(ev) -8c: vpmaskmovd/q Vx,Hx,Mx (66),(v) -8d: vpermb/w Vx,Hx,Wx (66),(ev) -8e: vpmaskmovd/q Mx,Vx,Hx (66),(v) -# 0x0f 0x38 0x90-0xbf (FMA) -90: vgatherdd/q Vx,Hx,Wx (66),(v) | vpgatherdd/q Vx,Wx (66),(evo) -91: vgatherqd/q Vx,Hx,Wx (66),(v) | vpgatherqd/q Vx,Wx (66),(evo) -92: vgatherdps/d Vx,Hx,Wx (66),(v) -93: vgatherqps/d Vx,Hx,Wx (66),(v) -94: -95: -96: vfmaddsub132ps/d Vx,Hx,Wx (66),(v) -97: vfmsubadd132ps/d Vx,Hx,Wx (66),(v) -98: vfmadd132ps/d Vx,Hx,Wx (66),(v) -99: vfmadd132ss/d Vx,Hx,Wx (66),(v),(v1) -9a: vfmsub132ps/d Vx,Hx,Wx (66),(v) -9b: vfmsub132ss/d Vx,Hx,Wx (66),(v),(v1) -9c: vfnmadd132ps/d Vx,Hx,Wx (66),(v) -9d: vfnmadd132ss/d Vx,Hx,Wx (66),(v),(v1) -9e: vfnmsub132ps/d Vx,Hx,Wx (66),(v) -9f: vfnmsub132ss/d Vx,Hx,Wx (66),(v),(v1) -a0: vpscatterdd/q Wx,Vx (66),(ev) -a1: vpscatterqd/q Wx,Vx (66),(ev) -a2: vscatterdps/d Wx,Vx (66),(ev) -a3: vscatterqps/d Wx,Vx (66),(ev) -a6: vfmaddsub213ps/d Vx,Hx,Wx (66),(v) -a7: vfmsubadd213ps/d Vx,Hx,Wx (66),(v) -a8: vfmadd213ps/d Vx,Hx,Wx (66),(v) -a9: vfmadd213ss/d Vx,Hx,Wx (66),(v),(v1) -aa: vfmsub213ps/d Vx,Hx,Wx (66),(v) -ab: vfmsub213ss/d Vx,Hx,Wx (66),(v),(v1) -ac: vfnmadd213ps/d Vx,Hx,Wx (66),(v) -ad: vfnmadd213ss/d Vx,Hx,Wx (66),(v),(v1) -ae: vfnmsub213ps/d Vx,Hx,Wx (66),(v) -af: vfnmsub213ss/d Vx,Hx,Wx (66),(v),(v1) -b4: vpmadd52luq Vx,Hx,Wx (66),(ev) -b5: vpmadd52huq Vx,Hx,Wx (66),(ev) -b6: vfmaddsub231ps/d Vx,Hx,Wx (66),(v) -b7: vfmsubadd231ps/d Vx,Hx,Wx (66),(v) -b8: vfmadd231ps/d Vx,Hx,Wx (66),(v) -b9: vfmadd231ss/d Vx,Hx,Wx (66),(v),(v1) -ba: vfmsub231ps/d Vx,Hx,Wx (66),(v) -bb: vfmsub231ss/d Vx,Hx,Wx (66),(v),(v1) -bc: vfnmadd231ps/d Vx,Hx,Wx (66),(v) -bd: vfnmadd231ss/d Vx,Hx,Wx (66),(v),(v1) -be: vfnmsub231ps/d Vx,Hx,Wx (66),(v) -bf: vfnmsub231ss/d Vx,Hx,Wx (66),(v),(v1) -# 0x0f 0x38 0xc0-0xff -c4: vpconflictd/q Vx,Wx (66),(ev) -c6: Grp18 (1A) -c7: Grp19 (1A) -c8: sha1nexte Vdq,Wdq | vexp2ps/d Vx,Wx (66),(ev) -c9: sha1msg1 Vdq,Wdq -ca: sha1msg2 Vdq,Wdq | vrcp28ps/d Vx,Wx (66),(ev) -cb: sha256rnds2 Vdq,Wdq | vrcp28ss/d Vx,Hx,Wx (66),(ev) -cc: sha256msg1 Vdq,Wdq | vrsqrt28ps/d Vx,Wx (66),(ev) -cd: sha256msg2 Vdq,Wdq | vrsqrt28ss/d Vx,Hx,Wx (66),(ev) -db: VAESIMC Vdq,Wdq (66),(v1) -dc: VAESENC Vdq,Hdq,Wdq (66),(v1) -dd: VAESENCLAST Vdq,Hdq,Wdq (66),(v1) -de: VAESDEC Vdq,Hdq,Wdq (66),(v1) -df: VAESDECLAST Vdq,Hdq,Wdq (66),(v1) -f0: MOVBE Gy,My | MOVBE Gw,Mw (66) | CRC32 Gd,Eb (F2) | CRC32 Gd,Eb (66&F2) -f1: MOVBE My,Gy | MOVBE Mw,Gw (66) | CRC32 Gd,Ey (F2) | CRC32 Gd,Ew (66&F2) -f2: ANDN Gy,By,Ey (v) -f3: Grp17 (1A) -f5: BZHI Gy,Ey,By (v) | PEXT Gy,By,Ey (F3),(v) | PDEP Gy,By,Ey (F2),(v) -f6: ADCX Gy,Ey (66) | ADOX Gy,Ey (F3) | MULX By,Gy,rDX,Ey (F2),(v) -f7: BEXTR Gy,Ey,By (v) | SHLX Gy,Ey,By (66),(v) | SARX Gy,Ey,By (F3),(v) | SHRX Gy,Ey,By (F2),(v) -EndTable - -Table: 3-byte opcode 2 (0x0f 0x3a) -Referrer: 3-byte escape 2 -AVXcode: 3 -# 0x0f 0x3a 0x00-0xff -00: vpermq Vqq,Wqq,Ib (66),(v) -01: vpermpd Vqq,Wqq,Ib (66),(v) -02: vpblendd Vx,Hx,Wx,Ib (66),(v) -03: valignd/q Vx,Hx,Wx,Ib (66),(ev) -04: vpermilps Vx,Wx,Ib (66),(v) -05: vpermilpd Vx,Wx,Ib (66),(v) -06: vperm2f128 Vqq,Hqq,Wqq,Ib (66),(v) -07: -08: vroundps Vx,Wx,Ib (66) | vrndscaleps Vx,Wx,Ib (66),(evo) -09: vroundpd Vx,Wx,Ib (66) | vrndscalepd Vx,Wx,Ib (66),(evo) -0a: vroundss Vss,Wss,Ib (66),(v1) | vrndscaless Vx,Hx,Wx,Ib (66),(evo) -0b: vroundsd Vsd,Wsd,Ib (66),(v1) | vrndscalesd Vx,Hx,Wx,Ib (66),(evo) -0c: vblendps Vx,Hx,Wx,Ib (66) -0d: vblendpd Vx,Hx,Wx,Ib (66) -0e: vpblendw Vx,Hx,Wx,Ib (66),(v1) -0f: palignr Pq,Qq,Ib | vpalignr Vx,Hx,Wx,Ib (66),(v1) -14: vpextrb Rd/Mb,Vdq,Ib (66),(v1) -15: vpextrw Rd/Mw,Vdq,Ib (66),(v1) -16: vpextrd/q Ey,Vdq,Ib (66),(v1) -17: vextractps Ed,Vdq,Ib (66),(v1) -18: vinsertf128 Vqq,Hqq,Wqq,Ib (66),(v) | vinsertf32x4/64x2 Vqq,Hqq,Wqq,Ib (66),(evo) -19: vextractf128 Wdq,Vqq,Ib (66),(v) | vextractf32x4/64x2 Wdq,Vqq,Ib (66),(evo) -1a: vinsertf32x8/64x4 Vqq,Hqq,Wqq,Ib (66),(ev) -1b: vextractf32x8/64x4 Wdq,Vqq,Ib (66),(ev) -1d: vcvtps2ph Wx,Vx,Ib (66),(v) -1e: vpcmpud/q Vk,Hd,Wd,Ib (66),(ev) -1f: vpcmpd/q Vk,Hd,Wd,Ib (66),(ev) -20: vpinsrb Vdq,Hdq,Ry/Mb,Ib (66),(v1) -21: vinsertps Vdq,Hdq,Udq/Md,Ib (66),(v1) -22: vpinsrd/q Vdq,Hdq,Ey,Ib (66),(v1) -23: vshuff32x4/64x2 Vx,Hx,Wx,Ib (66),(ev) -25: vpternlogd/q Vx,Hx,Wx,Ib (66),(ev) -26: vgetmantps/d Vx,Wx,Ib (66),(ev) -27: vgetmantss/d Vx,Hx,Wx,Ib (66),(ev) -30: kshiftrb/w Vk,Uk,Ib (66),(v) -31: kshiftrd/q Vk,Uk,Ib (66),(v) -32: kshiftlb/w Vk,Uk,Ib (66),(v) -33: kshiftld/q Vk,Uk,Ib (66),(v) -38: vinserti128 Vqq,Hqq,Wqq,Ib (66),(v) | vinserti32x4/64x2 Vqq,Hqq,Wqq,Ib (66),(evo) -39: vextracti128 Wdq,Vqq,Ib (66),(v) | vextracti32x4/64x2 Wdq,Vqq,Ib (66),(evo) -3a: vinserti32x8/64x4 Vqq,Hqq,Wqq,Ib (66),(ev) -3b: vextracti32x8/64x4 Wdq,Vqq,Ib (66),(ev) -3e: vpcmpub/w Vk,Hk,Wx,Ib (66),(ev) -3f: vpcmpb/w Vk,Hk,Wx,Ib (66),(ev) -40: vdpps Vx,Hx,Wx,Ib (66) -41: vdppd Vdq,Hdq,Wdq,Ib (66),(v1) -42: vmpsadbw Vx,Hx,Wx,Ib (66),(v1) | vdbpsadbw Vx,Hx,Wx,Ib (66),(evo) -43: vshufi32x4/64x2 Vx,Hx,Wx,Ib (66),(ev) -44: vpclmulqdq Vdq,Hdq,Wdq,Ib (66),(v1) -46: vperm2i128 Vqq,Hqq,Wqq,Ib (66),(v) -4a: vblendvps Vx,Hx,Wx,Lx (66),(v) -4b: vblendvpd Vx,Hx,Wx,Lx (66),(v) -4c: vpblendvb Vx,Hx,Wx,Lx (66),(v1) -50: vrangeps/d Vx,Hx,Wx,Ib (66),(ev) -51: vrangess/d Vx,Hx,Wx,Ib (66),(ev) -54: vfixupimmps/d Vx,Hx,Wx,Ib (66),(ev) -55: vfixupimmss/d Vx,Hx,Wx,Ib (66),(ev) -56: vreduceps/d Vx,Wx,Ib (66),(ev) -57: vreducess/d Vx,Hx,Wx,Ib (66),(ev) -60: vpcmpestrm Vdq,Wdq,Ib (66),(v1) -61: vpcmpestri Vdq,Wdq,Ib (66),(v1) -62: vpcmpistrm Vdq,Wdq,Ib (66),(v1) -63: vpcmpistri Vdq,Wdq,Ib (66),(v1) -66: vfpclassps/d Vk,Wx,Ib (66),(ev) -67: vfpclassss/d Vk,Wx,Ib (66),(ev) -cc: sha1rnds4 Vdq,Wdq,Ib -df: VAESKEYGEN Vdq,Wdq,Ib (66),(v1) -f0: RORX Gy,Ey,Ib (F2),(v) -EndTable - -GrpTable: Grp1 -0: ADD -1: OR -2: ADC -3: SBB -4: AND -5: SUB -6: XOR -7: CMP -EndTable - -GrpTable: Grp1A -0: POP -EndTable - -GrpTable: Grp2 -0: ROL -1: ROR -2: RCL -3: RCR -4: SHL/SAL -5: SHR -6: -7: SAR -EndTable - -GrpTable: Grp3_1 -0: TEST Eb,Ib -1: -2: NOT Eb -3: NEG Eb -4: MUL AL,Eb -5: IMUL AL,Eb -6: DIV AL,Eb -7: IDIV AL,Eb -EndTable - -GrpTable: Grp3_2 -0: TEST Ev,Iz -1: -2: NOT Ev -3: NEG Ev -4: MUL rAX,Ev -5: IMUL rAX,Ev -6: DIV rAX,Ev -7: IDIV rAX,Ev -EndTable - -GrpTable: Grp4 -0: INC Eb -1: DEC Eb -EndTable - -GrpTable: Grp5 -0: INC Ev -1: DEC Ev -# Note: "forced64" is Intel CPU behavior (see comment about CALL insn). -2: CALLN Ev (f64) -3: CALLF Ep -4: JMPN Ev (f64) -5: JMPF Mp -6: PUSH Ev (d64) -7: -EndTable - -GrpTable: Grp6 -0: SLDT Rv/Mw -1: STR Rv/Mw -2: LLDT Ew -3: LTR Ew -4: VERR Ew -5: VERW Ew -EndTable - -GrpTable: Grp7 -0: SGDT Ms | VMCALL (001),(11B) | VMLAUNCH (010),(11B) | VMRESUME (011),(11B) | VMXOFF (100),(11B) -1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) -2: LGDT Ms | XGETBV (000),(11B) | XSETBV (001),(11B) | VMFUNC (100),(11B) | XEND (101)(11B) | XTEST (110)(11B) -3: LIDT Ms -4: SMSW Mw/Rv -5: rdpkru (110),(11B) | wrpkru (111),(11B) -6: LMSW Ew -7: INVLPG Mb | SWAPGS (o64),(000),(11B) | RDTSCP (001),(11B) -EndTable - -GrpTable: Grp8 -4: BT -5: BTS -6: BTR -7: BTC -EndTable - -GrpTable: Grp9 -1: CMPXCHG8B/16B Mq/Mdq -3: xrstors -4: xsavec -5: xsaves -6: VMPTRLD Mq | VMCLEAR Mq (66) | VMXON Mq (F3) | RDRAND Rv (11B) -7: VMPTRST Mq | VMPTRST Mq (F3) | RDSEED Rv (11B) -EndTable - -GrpTable: Grp10 -EndTable - -# Grp11A and Grp11B are expressed as Grp11 in Intel SDM -GrpTable: Grp11A -0: MOV Eb,Ib -7: XABORT Ib (000),(11B) -EndTable - -GrpTable: Grp11B -0: MOV Eb,Iz -7: XBEGIN Jz (000),(11B) -EndTable - -GrpTable: Grp12 -2: psrlw Nq,Ib (11B) | vpsrlw Hx,Ux,Ib (66),(11B),(v1) -4: psraw Nq,Ib (11B) | vpsraw Hx,Ux,Ib (66),(11B),(v1) -6: psllw Nq,Ib (11B) | vpsllw Hx,Ux,Ib (66),(11B),(v1) -EndTable - -GrpTable: Grp13 -0: vprord/q Hx,Wx,Ib (66),(ev) -1: vprold/q Hx,Wx,Ib (66),(ev) -2: psrld Nq,Ib (11B) | vpsrld Hx,Ux,Ib (66),(11B),(v1) -4: psrad Nq,Ib (11B) | vpsrad Hx,Ux,Ib (66),(11B),(v1) | vpsrad/q Hx,Ux,Ib (66),(evo) -6: pslld Nq,Ib (11B) | vpslld Hx,Ux,Ib (66),(11B),(v1) -EndTable - -GrpTable: Grp14 -2: psrlq Nq,Ib (11B) | vpsrlq Hx,Ux,Ib (66),(11B),(v1) -3: vpsrldq Hx,Ux,Ib (66),(11B),(v1) -6: psllq Nq,Ib (11B) | vpsllq Hx,Ux,Ib (66),(11B),(v1) -7: vpslldq Hx,Ux,Ib (66),(11B),(v1) -EndTable - -GrpTable: Grp15 -0: fxsave | RDFSBASE Ry (F3),(11B) -1: fxstor | RDGSBASE Ry (F3),(11B) -2: vldmxcsr Md (v1) | WRFSBASE Ry (F3),(11B) -3: vstmxcsr Md (v1) | WRGSBASE Ry (F3),(11B) -4: XSAVE | ptwrite Ey (F3),(11B) -5: XRSTOR | lfence (11B) -6: XSAVEOPT | clwb (66) | mfence (11B) -7: clflush | clflushopt (66) | sfence (11B) -EndTable - -GrpTable: Grp16 -0: prefetch NTA -1: prefetch T0 -2: prefetch T1 -3: prefetch T2 -EndTable - -GrpTable: Grp17 -1: BLSR By,Ey (v) -2: BLSMSK By,Ey (v) -3: BLSI By,Ey (v) -EndTable - -GrpTable: Grp18 -1: vgatherpf0dps/d Wx (66),(ev) -2: vgatherpf1dps/d Wx (66),(ev) -5: vscatterpf0dps/d Wx (66),(ev) -6: vscatterpf1dps/d Wx (66),(ev) -EndTable - -GrpTable: Grp19 -1: vgatherpf0qps/d Wx (66),(ev) -2: vgatherpf1qps/d Wx (66),(ev) -5: vscatterpf0qps/d Wx (66),(ev) -6: vscatterpf1qps/d Wx (66),(ev) -EndTable - -# AMD's Prefetch Group -GrpTable: GrpP -0: PREFETCH -1: PREFETCHW -EndTable - -GrpTable: GrpPDLK -0: MONTMUL -1: XSHA1 -2: XSHA2 -EndTable - -GrpTable: GrpRNG -0: xstore-rng -1: xcrypt-ecb -2: xcrypt-cbc -4: xcrypt-cfb -5: xcrypt-ofb -EndTable diff --git a/tools/objtool/arch/x86/lib/inat.c b/tools/objtool/arch/x86/lib/inat.c new file mode 100644 index 000000000000..c1f01a8e9f65 --- /dev/null +++ b/tools/objtool/arch/x86/lib/inat.c @@ -0,0 +1,97 @@ +/* + * x86 instruction attribute tables + * + * Written by Masami Hiramatsu <mhiramat(a)redhat.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + */ +#include <asm/insn.h> + +/* Attribute tables are generated from opcode map */ +#include "inat-tables.c" + +/* Attribute search APIs */ +insn_attr_t inat_get_opcode_attribute(insn_byte_t opcode) +{ + return inat_primary_table[opcode]; +} + +int inat_get_last_prefix_id(insn_byte_t last_pfx) +{ + insn_attr_t lpfx_attr; + + lpfx_attr = inat_get_opcode_attribute(last_pfx); + return inat_last_prefix_id(lpfx_attr); +} + +insn_attr_t inat_get_escape_attribute(insn_byte_t opcode, int lpfx_id, + insn_attr_t esc_attr) +{ + const insn_attr_t *table; + int n; + + n = inat_escape_id(esc_attr); + + table = inat_escape_tables[n][0]; + if (!table) + return 0; + if (inat_has_variant(table[opcode]) && lpfx_id) { + table = inat_escape_tables[n][lpfx_id]; + if (!table) + return 0; + } + return table[opcode]; +} + +insn_attr_t inat_get_group_attribute(insn_byte_t modrm, int lpfx_id, + insn_attr_t grp_attr) +{ + const insn_attr_t *table; + int n; + + n = inat_group_id(grp_attr); + + table = inat_group_tables[n][0]; + if (!table) + return inat_group_common_attribute(grp_attr); + if (inat_has_variant(table[X86_MODRM_REG(modrm)]) && lpfx_id) { + table = inat_group_tables[n][lpfx_id]; + if (!table) + return inat_group_common_attribute(grp_attr); + } + return table[X86_MODRM_REG(modrm)] | + inat_group_common_attribute(grp_attr); +} + +insn_attr_t inat_get_avx_attribute(insn_byte_t opcode, insn_byte_t vex_m, + insn_byte_t vex_p) +{ + const insn_attr_t *table; + if (vex_m > X86_VEX_M_MAX || vex_p > INAT_LSTPFX_MAX) + return 0; + /* At first, this checks the master table */ + table = inat_avx_tables[vex_m][0]; + if (!table) + return 0; + if (!inat_is_group(table[opcode]) && vex_p) { + /* If this is not a group, get attribute directly */ + table = inat_avx_tables[vex_m][vex_p]; + if (!table) + return 0; + } + return table[opcode]; +} + diff --git a/tools/objtool/arch/x86/lib/insn.c b/tools/objtool/arch/x86/lib/insn.c new file mode 100644 index 000000000000..1088eb8f3a5f --- /dev/null +++ b/tools/objtool/arch/x86/lib/insn.c @@ -0,0 +1,606 @@ +/* + * x86 instruction analysis + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) IBM Corporation, 2002, 2004, 2009 + */ + +#ifdef __KERNEL__ +#include <linux/string.h> +#else +#include <string.h> +#endif +#include <asm/inat.h> +#include <asm/insn.h> + +/* Verify next sizeof(t) bytes can be on the same instruction */ +#define validate_next(t, insn, n) \ + ((insn)->next_byte + sizeof(t) + n <= (insn)->end_kaddr) + +#define __get_next(t, insn) \ + ({ t r = *(t*)insn->next_byte; insn->next_byte += sizeof(t); r; }) + +#define __peek_nbyte_next(t, insn, n) \ + ({ t r = *(t*)((insn)->next_byte + n); r; }) + +#define get_next(t, insn) \ + ({ if (unlikely(!validate_next(t, insn, 0))) goto err_out; __get_next(t, insn); }) + +#define peek_nbyte_next(t, insn, n) \ + ({ if (unlikely(!validate_next(t, insn, n))) goto err_out; __peek_nbyte_next(t, insn, n); }) + +#define peek_next(t, insn) peek_nbyte_next(t, insn, 0) + +/** + * insn_init() - initialize struct insn + * @insn: &struct insn to be initialized + * @kaddr: address (in kernel memory) of instruction (or copy thereof) + * @x86_64: !0 for 64-bit kernel or 64-bit app + */ +void insn_init(struct insn *insn, const void *kaddr, int buf_len, int x86_64) +{ + /* + * Instructions longer than MAX_INSN_SIZE (15 bytes) are invalid + * even if the input buffer is long enough to hold them. + */ + if (buf_len > MAX_INSN_SIZE) + buf_len = MAX_INSN_SIZE; + + memset(insn, 0, sizeof(*insn)); + insn->kaddr = kaddr; + insn->end_kaddr = kaddr + buf_len; + insn->next_byte = kaddr; + insn->x86_64 = x86_64 ? 1 : 0; + insn->opnd_bytes = 4; + if (x86_64) + insn->addr_bytes = 8; + else + insn->addr_bytes = 4; +} + +/** + * insn_get_prefixes - scan x86 instruction prefix bytes + * @insn: &struct insn containing instruction + * + * Populates the @insn->prefixes bitmap, and updates @insn->next_byte + * to point to the (first) opcode. No effect if @insn->prefixes.got + * is already set. + */ +void insn_get_prefixes(struct insn *insn) +{ + struct insn_field *prefixes = &insn->prefixes; + insn_attr_t attr; + insn_byte_t b, lb; + int i, nb; + + if (prefixes->got) + return; + + nb = 0; + lb = 0; + b = peek_next(insn_byte_t, insn); + attr = inat_get_opcode_attribute(b); + while (inat_is_legacy_prefix(attr)) { + /* Skip if same prefix */ + for (i = 0; i < nb; i++) + if (prefixes->bytes[i] == b) + goto found; + if (nb == 4) + /* Invalid instruction */ + break; + prefixes->bytes[nb++] = b; + if (inat_is_address_size_prefix(attr)) { + /* address size switches 2/4 or 4/8 */ + if (insn->x86_64) + insn->addr_bytes ^= 12; + else + insn->addr_bytes ^= 6; + } else if (inat_is_operand_size_prefix(attr)) { + /* oprand size switches 2/4 */ + insn->opnd_bytes ^= 6; + } +found: + prefixes->nbytes++; + insn->next_byte++; + lb = b; + b = peek_next(insn_byte_t, insn); + attr = inat_get_opcode_attribute(b); + } + /* Set the last prefix */ + if (lb && lb != insn->prefixes.bytes[3]) { + if (unlikely(insn->prefixes.bytes[3])) { + /* Swap the last prefix */ + b = insn->prefixes.bytes[3]; + for (i = 0; i < nb; i++) + if (prefixes->bytes[i] == lb) + prefixes->bytes[i] = b; + } + insn->prefixes.bytes[3] = lb; + } + + /* Decode REX prefix */ + if (insn->x86_64) { + b = peek_next(insn_byte_t, insn); + attr = inat_get_opcode_attribute(b); + if (inat_is_rex_prefix(attr)) { + insn->rex_prefix.value = b; + insn->rex_prefix.nbytes = 1; + insn->next_byte++; + if (X86_REX_W(b)) + /* REX.W overrides opnd_size */ + insn->opnd_bytes = 8; + } + } + insn->rex_prefix.got = 1; + + /* Decode VEX prefix */ + b = peek_next(insn_byte_t, insn); + attr = inat_get_opcode_attribute(b); + if (inat_is_vex_prefix(attr)) { + insn_byte_t b2 = peek_nbyte_next(insn_byte_t, insn, 1); + if (!insn->x86_64) { + /* + * In 32-bits mode, if the [7:6] bits (mod bits of + * ModRM) on the second byte are not 11b, it is + * LDS or LES or BOUND. + */ + if (X86_MODRM_MOD(b2) != 3) + goto vex_end; + } + insn->vex_prefix.bytes[0] = b; + insn->vex_prefix.bytes[1] = b2; + if (inat_is_evex_prefix(attr)) { + b2 = peek_nbyte_next(insn_byte_t, insn, 2); + insn->vex_prefix.bytes[2] = b2; + b2 = peek_nbyte_next(insn_byte_t, insn, 3); + insn->vex_prefix.bytes[3] = b2; + insn->vex_prefix.nbytes = 4; + insn->next_byte += 4; + if (insn->x86_64 && X86_VEX_W(b2)) + /* VEX.W overrides opnd_size */ + insn->opnd_bytes = 8; + } else if (inat_is_vex3_prefix(attr)) { + b2 = peek_nbyte_next(insn_byte_t, insn, 2); + insn->vex_prefix.bytes[2] = b2; + insn->vex_prefix.nbytes = 3; + insn->next_byte += 3; + if (insn->x86_64 && X86_VEX_W(b2)) + /* VEX.W overrides opnd_size */ + insn->opnd_bytes = 8; + } else { + /* + * For VEX2, fake VEX3-like byte#2. + * Makes it easier to decode vex.W, vex.vvvv, + * vex.L and vex.pp. Masking with 0x7f sets vex.W == 0. + */ + insn->vex_prefix.bytes[2] = b2 & 0x7f; + insn->vex_prefix.nbytes = 2; + insn->next_byte += 2; + } + } +vex_end: + insn->vex_prefix.got = 1; + + prefixes->got = 1; + +err_out: + return; +} + +/** + * insn_get_opcode - collect opcode(s) + * @insn: &struct insn containing instruction + * + * Populates @insn->opcode, updates @insn->next_byte to point past the + * opcode byte(s), and set @insn->attr (except for groups). + * If necessary, first collects any preceding (prefix) bytes. + * Sets @insn->opcode.value = opcode1. No effect if @insn->opcode.got + * is already 1. + */ +void insn_get_opcode(struct insn *insn) +{ + struct insn_field *opcode = &insn->opcode; + insn_byte_t op; + int pfx_id; + if (opcode->got) + return; + if (!insn->prefixes.got) + insn_get_prefixes(insn); + + /* Get first opcode */ + op = get_next(insn_byte_t, insn); + opcode->bytes[0] = op; + opcode->nbytes = 1; + + /* Check if there is VEX prefix or not */ + if (insn_is_avx(insn)) { + insn_byte_t m, p; + m = insn_vex_m_bits(insn); + p = insn_vex_p_bits(insn); + insn->attr = inat_get_avx_attribute(op, m, p); + if ((inat_must_evex(insn->attr) && !insn_is_evex(insn)) || + (!inat_accept_vex(insn->attr) && + !inat_is_group(insn->attr))) + insn->attr = 0; /* This instruction is bad */ + goto end; /* VEX has only 1 byte for opcode */ + } + + insn->attr = inat_get_opcode_attribute(op); + while (inat_is_escape(insn->attr)) { + /* Get escaped opcode */ + op = get_next(insn_byte_t, insn); + opcode->bytes[opcode->nbytes++] = op; + pfx_id = insn_last_prefix_id(insn); + insn->attr = inat_get_escape_attribute(op, pfx_id, insn->attr); + } + if (inat_must_vex(insn->attr)) + insn->attr = 0; /* This instruction is bad */ +end: + opcode->got = 1; + +err_out: + return; +} + +/** + * insn_get_modrm - collect ModRM byte, if any + * @insn: &struct insn containing instruction + * + * Populates @insn->modrm and updates @insn->next_byte to point past the + * ModRM byte, if any. If necessary, first collects the preceding bytes + * (prefixes and opcode(s)). No effect if @insn->modrm.got is already 1. + */ +void insn_get_modrm(struct insn *insn) +{ + struct insn_field *modrm = &insn->modrm; + insn_byte_t pfx_id, mod; + if (modrm->got) + return; + if (!insn->opcode.got) + insn_get_opcode(insn); + + if (inat_has_modrm(insn->attr)) { + mod = get_next(insn_byte_t, insn); + modrm->value = mod; + modrm->nbytes = 1; + if (inat_is_group(insn->attr)) { + pfx_id = insn_last_prefix_id(insn); + insn->attr = inat_get_group_attribute(mod, pfx_id, + insn->attr); + if (insn_is_avx(insn) && !inat_accept_vex(insn->attr)) + insn->attr = 0; /* This is bad */ + } + } + + if (insn->x86_64 && inat_is_force64(insn->attr)) + insn->opnd_bytes = 8; + modrm->got = 1; + +err_out: + return; +} + + +/** + * insn_rip_relative() - Does instruction use RIP-relative addressing mode? + * @insn: &struct insn containing instruction + * + * If necessary, first collects the instruction up to and including the + * ModRM byte. No effect if @insn->x86_64 is 0. + */ +int insn_rip_relative(struct insn *insn) +{ + struct insn_field *modrm = &insn->modrm; + + if (!insn->x86_64) + return 0; + if (!modrm->got) + insn_get_modrm(insn); + /* + * For rip-relative instructions, the mod field (top 2 bits) + * is zero and the r/m field (bottom 3 bits) is 0x5. + */ + return (modrm->nbytes && (modrm->value & 0xc7) == 0x5); +} + +/** + * insn_get_sib() - Get the SIB byte of instruction + * @insn: &struct insn containing instruction + * + * If necessary, first collects the instruction up to and including the + * ModRM byte. + */ +void insn_get_sib(struct insn *insn) +{ + insn_byte_t modrm; + + if (insn->sib.got) + return; + if (!insn->modrm.got) + insn_get_modrm(insn); + if (insn->modrm.nbytes) { + modrm = (insn_byte_t)insn->modrm.value; + if (insn->addr_bytes != 2 && + X86_MODRM_MOD(modrm) != 3 && X86_MODRM_RM(modrm) == 4) { + insn->sib.value = get_next(insn_byte_t, insn); + insn->sib.nbytes = 1; + } + } + insn->sib.got = 1; + +err_out: + return; +} + + +/** + * insn_get_displacement() - Get the displacement of instruction + * @insn: &struct insn containing instruction + * + * If necessary, first collects the instruction up to and including the + * SIB byte. + * Displacement value is sign-expanded. + */ +void insn_get_displacement(struct insn *insn) +{ + insn_byte_t mod, rm, base; + + if (insn->displacement.got) + return; + if (!insn->sib.got) + insn_get_sib(insn); + if (insn->modrm.nbytes) { + /* + * Interpreting the modrm byte: + * mod = 00 - no displacement fields (exceptions below) + * mod = 01 - 1-byte displacement field + * mod = 10 - displacement field is 4 bytes, or 2 bytes if + * address size = 2 (0x67 prefix in 32-bit mode) + * mod = 11 - no memory operand + * + * If address size = 2... + * mod = 00, r/m = 110 - displacement field is 2 bytes + * + * If address size != 2... + * mod != 11, r/m = 100 - SIB byte exists + * mod = 00, SIB base = 101 - displacement field is 4 bytes + * mod = 00, r/m = 101 - rip-relative addressing, displacement + * field is 4 bytes + */ + mod = X86_MODRM_MOD(insn->modrm.value); + rm = X86_MODRM_RM(insn->modrm.value); + base = X86_SIB_BASE(insn->sib.value); + if (mod == 3) + goto out; + if (mod == 1) { + insn->displacement.value = get_next(signed char, insn); + insn->displacement.nbytes = 1; + } else if (insn->addr_bytes == 2) { + if ((mod == 0 && rm == 6) || mod == 2) { + insn->displacement.value = + get_next(short, insn); + insn->displacement.nbytes = 2; + } + } else { + if ((mod == 0 && rm == 5) || mod == 2 || + (mod == 0 && base == 5)) { + insn->displacement.value = get_next(int, insn); + insn->displacement.nbytes = 4; + } + } + } +out: + insn->displacement.got = 1; + +err_out: + return; +} + +/* Decode moffset16/32/64. Return 0 if failed */ +static int __get_moffset(struct insn *insn) +{ + switch (insn->addr_bytes) { + case 2: + insn->moffset1.value = get_next(short, insn); + insn->moffset1.nbytes = 2; + break; + case 4: + insn->moffset1.value = get_next(int, insn); + insn->moffset1.nbytes = 4; + break; + case 8: + insn->moffset1.value = get_next(int, insn); + insn->moffset1.nbytes = 4; + insn->moffset2.value = get_next(int, insn); + insn->moffset2.nbytes = 4; + break; + default: /* opnd_bytes must be modified manually */ + goto err_out; + } + insn->moffset1.got = insn->moffset2.got = 1; + + return 1; + +err_out: + return 0; +} + +/* Decode imm v32(Iz). Return 0 if failed */ +static int __get_immv32(struct insn *insn) +{ + switch (insn->opnd_bytes) { + case 2: + insn->immediate.value = get_next(short, insn); + insn->immediate.nbytes = 2; + break; + case 4: + case 8: + insn->immediate.value = get_next(int, insn); + insn->immediate.nbytes = 4; + break; + default: /* opnd_bytes must be modified manually */ + goto err_out; + } + + return 1; + +err_out: + return 0; +} + +/* Decode imm v64(Iv/Ov), Return 0 if failed */ +static int __get_immv(struct insn *insn) +{ + switch (insn->opnd_bytes) { + case 2: + insn->immediate1.value = get_next(short, insn); + insn->immediate1.nbytes = 2; + break; + case 4: + insn->immediate1.value = get_next(int, insn); + insn->immediate1.nbytes = 4; + break; + case 8: + insn->immediate1.value = get_next(int, insn); + insn->immediate1.nbytes = 4; + insn->immediate2.value = get_next(int, insn); + insn->immediate2.nbytes = 4; + break; + default: /* opnd_bytes must be modified manually */ + goto err_out; + } + insn->immediate1.got = insn->immediate2.got = 1; + + return 1; +err_out: + return 0; +} + +/* Decode ptr16:16/32(Ap) */ +static int __get_immptr(struct insn *insn) +{ + switch (insn->opnd_bytes) { + case 2: + insn->immediate1.value = get_next(short, insn); + insn->immediate1.nbytes = 2; + break; + case 4: + insn->immediate1.value = get_next(int, insn); + insn->immediate1.nbytes = 4; + break; + case 8: + /* ptr16:64 is not exist (no segment) */ + return 0; + default: /* opnd_bytes must be modified manually */ + goto err_out; + } + insn->immediate2.value = get_next(unsigned short, insn); + insn->immediate2.nbytes = 2; + insn->immediate1.got = insn->immediate2.got = 1; + + return 1; +err_out: + return 0; +} + +/** + * insn_get_immediate() - Get the immediates of instruction + * @insn: &struct insn containing instruction + * + * If necessary, first collects the instruction up to and including the + * displacement bytes. + * Basically, most of immediates are sign-expanded. Unsigned-value can be + * get by bit masking with ((1 << (nbytes * 8)) - 1) + */ +void insn_get_immediate(struct insn *insn) +{ + if (insn->immediate.got) + return; + if (!insn->displacement.got) + insn_get_displacement(insn); + + if (inat_has_moffset(insn->attr)) { + if (!__get_moffset(insn)) + goto err_out; + goto done; + } + + if (!inat_has_immediate(insn->attr)) + /* no immediates */ + goto done; + + switch (inat_immediate_size(insn->attr)) { + case INAT_IMM_BYTE: + insn->immediate.value = get_next(signed char, insn); + insn->immediate.nbytes = 1; + break; + case INAT_IMM_WORD: + insn->immediate.value = get_next(short, insn); + insn->immediate.nbytes = 2; + break; + case INAT_IMM_DWORD: + insn->immediate.value = get_next(int, insn); + insn->immediate.nbytes = 4; + break; + case INAT_IMM_QWORD: + insn->immediate1.value = get_next(int, insn); + insn->immediate1.nbytes = 4; + insn->immediate2.value = get_next(int, insn); + insn->immediate2.nbytes = 4; + break; + case INAT_IMM_PTR: + if (!__get_immptr(insn)) + goto err_out; + break; + case INAT_IMM_VWORD32: + if (!__get_immv32(insn)) + goto err_out; + break; + case INAT_IMM_VWORD: + if (!__get_immv(insn)) + goto err_out; + break; + default: + /* Here, insn must have an immediate, but failed */ + goto err_out; + } + if (inat_has_second_immediate(insn->attr)) { + insn->immediate2.value = get_next(signed char, insn); + insn->immediate2.nbytes = 1; + } +done: + insn->immediate.got = 1; + +err_out: + return; +} + +/** + * insn_get_length() - Get the length of instruction + * @insn: &struct insn containing instruction + * + * If necessary, first collects the instruction up to and including the + * immediates bytes. + */ +void insn_get_length(struct insn *insn) +{ + if (insn->length) + return; + if (!insn->immediate.got) + insn_get_immediate(insn); + insn->length = (unsigned char)((unsigned long)insn->next_byte + - (unsigned long)insn->kaddr); +} diff --git a/tools/objtool/arch/x86/lib/x86-opcode-map.txt b/tools/objtool/arch/x86/lib/x86-opcode-map.txt new file mode 100644 index 000000000000..e0b85930dd77 --- /dev/null +++ b/tools/objtool/arch/x86/lib/x86-opcode-map.txt @@ -0,0 +1,1072 @@ +# x86 Opcode Maps +# +# This is (mostly) based on following documentations. +# - Intel(R) 64 and IA-32 Architectures Software Developer's Manual Vol.2C +# (#326018-047US, June 2013) +# +#<Opcode maps> +# Table: table-name +# Referrer: escaped-name +# AVXcode: avx-code +# opcode: mnemonic|GrpXXX [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...] +# (or) +# opcode: escape # escaped-name +# EndTable +# +# mnemonics that begin with lowercase 'v' accept a VEX or EVEX prefix +# mnemonics that begin with lowercase 'k' accept a VEX prefix +# +#<group maps> +# GrpTable: GrpXXX +# reg: mnemonic [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...] +# EndTable +# +# AVX Superscripts +# (ev): this opcode requires EVEX prefix. +# (evo): this opcode is changed by EVEX prefix (EVEX opcode) +# (v): this opcode requires VEX prefix. +# (v1): this opcode only supports 128bit VEX. +# +# Last Prefix Superscripts +# - (66): the last prefix is 0x66 +# - (F3): the last prefix is 0xF3 +# - (F2): the last prefix is 0xF2 +# - (!F3) : the last prefix is not 0xF3 (including non-last prefix case) +# - (66&F2): Both 0x66 and 0xF2 prefixes are specified. + +Table: one byte opcode +Referrer: +AVXcode: +# 0x00 - 0x0f +00: ADD Eb,Gb +01: ADD Ev,Gv +02: ADD Gb,Eb +03: ADD Gv,Ev +04: ADD AL,Ib +05: ADD rAX,Iz +06: PUSH ES (i64) +07: POP ES (i64) +08: OR Eb,Gb +09: OR Ev,Gv +0a: OR Gb,Eb +0b: OR Gv,Ev +0c: OR AL,Ib +0d: OR rAX,Iz +0e: PUSH CS (i64) +0f: escape # 2-byte escape +# 0x10 - 0x1f +10: ADC Eb,Gb +11: ADC Ev,Gv +12: ADC Gb,Eb +13: ADC Gv,Ev +14: ADC AL,Ib +15: ADC rAX,Iz +16: PUSH SS (i64) +17: POP SS (i64) +18: SBB Eb,Gb +19: SBB Ev,Gv +1a: SBB Gb,Eb +1b: SBB Gv,Ev +1c: SBB AL,Ib +1d: SBB rAX,Iz +1e: PUSH DS (i64) +1f: POP DS (i64) +# 0x20 - 0x2f +20: AND Eb,Gb +21: AND Ev,Gv +22: AND Gb,Eb +23: AND Gv,Ev +24: AND AL,Ib +25: AND rAx,Iz +26: SEG=ES (Prefix) +27: DAA (i64) +28: SUB Eb,Gb +29: SUB Ev,Gv +2a: SUB Gb,Eb +2b: SUB Gv,Ev +2c: SUB AL,Ib +2d: SUB rAX,Iz +2e: SEG=CS (Prefix) +2f: DAS (i64) +# 0x30 - 0x3f +30: XOR Eb,Gb +31: XOR Ev,Gv +32: XOR Gb,Eb +33: XOR Gv,Ev +34: XOR AL,Ib +35: XOR rAX,Iz +36: SEG=SS (Prefix) +37: AAA (i64) +38: CMP Eb,Gb +39: CMP Ev,Gv +3a: CMP Gb,Eb +3b: CMP Gv,Ev +3c: CMP AL,Ib +3d: CMP rAX,Iz +3e: SEG=DS (Prefix) +3f: AAS (i64) +# 0x40 - 0x4f +40: INC eAX (i64) | REX (o64) +41: INC eCX (i64) | REX.B (o64) +42: INC eDX (i64) | REX.X (o64) +43: INC eBX (i64) | REX.XB (o64) +44: INC eSP (i64) | REX.R (o64) +45: INC eBP (i64) | REX.RB (o64) +46: INC eSI (i64) | REX.RX (o64) +47: INC eDI (i64) | REX.RXB (o64) +48: DEC eAX (i64) | REX.W (o64) +49: DEC eCX (i64) | REX.WB (o64) +4a: DEC eDX (i64) | REX.WX (o64) +4b: DEC eBX (i64) | REX.WXB (o64) +4c: DEC eSP (i64) | REX.WR (o64) +4d: DEC eBP (i64) | REX.WRB (o64) +4e: DEC eSI (i64) | REX.WRX (o64) +4f: DEC eDI (i64) | REX.WRXB (o64) +# 0x50 - 0x5f +50: PUSH rAX/r8 (d64) +51: PUSH rCX/r9 (d64) +52: PUSH rDX/r10 (d64) +53: PUSH rBX/r11 (d64) +54: PUSH rSP/r12 (d64) +55: PUSH rBP/r13 (d64) +56: PUSH rSI/r14 (d64) +57: PUSH rDI/r15 (d64) +58: POP rAX/r8 (d64) +59: POP rCX/r9 (d64) +5a: POP rDX/r10 (d64) +5b: POP rBX/r11 (d64) +5c: POP rSP/r12 (d64) +5d: POP rBP/r13 (d64) +5e: POP rSI/r14 (d64) +5f: POP rDI/r15 (d64) +# 0x60 - 0x6f +60: PUSHA/PUSHAD (i64) +61: POPA/POPAD (i64) +62: BOUND Gv,Ma (i64) | EVEX (Prefix) +63: ARPL Ew,Gw (i64) | MOVSXD Gv,Ev (o64) +64: SEG=FS (Prefix) +65: SEG=GS (Prefix) +66: Operand-Size (Prefix) +67: Address-Size (Prefix) +68: PUSH Iz (d64) +69: IMUL Gv,Ev,Iz +6a: PUSH Ib (d64) +6b: IMUL Gv,Ev,Ib +6c: INS/INSB Yb,DX +6d: INS/INSW/INSD Yz,DX +6e: OUTS/OUTSB DX,Xb +6f: OUTS/OUTSW/OUTSD DX,Xz +# 0x70 - 0x7f +70: JO Jb +71: JNO Jb +72: JB/JNAE/JC Jb +73: JNB/JAE/JNC Jb +74: JZ/JE Jb +75: JNZ/JNE Jb +76: JBE/JNA Jb +77: JNBE/JA Jb +78: JS Jb +79: JNS Jb +7a: JP/JPE Jb +7b: JNP/JPO Jb +7c: JL/JNGE Jb +7d: JNL/JGE Jb +7e: JLE/JNG Jb +7f: JNLE/JG Jb +# 0x80 - 0x8f +80: Grp1 Eb,Ib (1A) +81: Grp1 Ev,Iz (1A) +82: Grp1 Eb,Ib (1A),(i64) +83: Grp1 Ev,Ib (1A) +84: TEST Eb,Gb +85: TEST Ev,Gv +86: XCHG Eb,Gb +87: XCHG Ev,Gv +88: MOV Eb,Gb +89: MOV Ev,Gv +8a: MOV Gb,Eb +8b: MOV Gv,Ev +8c: MOV Ev,Sw +8d: LEA Gv,M +8e: MOV Sw,Ew +8f: Grp1A (1A) | POP Ev (d64) +# 0x90 - 0x9f +90: NOP | PAUSE (F3) | XCHG r8,rAX +91: XCHG rCX/r9,rAX +92: XCHG rDX/r10,rAX +93: XCHG rBX/r11,rAX +94: XCHG rSP/r12,rAX +95: XCHG rBP/r13,rAX +96: XCHG rSI/r14,rAX +97: XCHG rDI/r15,rAX +98: CBW/CWDE/CDQE +99: CWD/CDQ/CQO +9a: CALLF Ap (i64) +9b: FWAIT/WAIT +9c: PUSHF/D/Q Fv (d64) +9d: POPF/D/Q Fv (d64) +9e: SAHF +9f: LAHF +# 0xa0 - 0xaf +a0: MOV AL,Ob +a1: MOV rAX,Ov +a2: MOV Ob,AL +a3: MOV Ov,rAX +a4: MOVS/B Yb,Xb +a5: MOVS/W/D/Q Yv,Xv +a6: CMPS/B Xb,Yb +a7: CMPS/W/D Xv,Yv +a8: TEST AL,Ib +a9: TEST rAX,Iz +aa: STOS/B Yb,AL +ab: STOS/W/D/Q Yv,rAX +ac: LODS/B AL,Xb +ad: LODS/W/D/Q rAX,Xv +ae: SCAS/B AL,Yb +# Note: The May 2011 Intel manual shows Xv for the second parameter of the +# next instruction but Yv is correct +af: SCAS/W/D/Q rAX,Yv +# 0xb0 - 0xbf +b0: MOV AL/R8L,Ib +b1: MOV CL/R9L,Ib +b2: MOV DL/R10L,Ib +b3: MOV BL/R11L,Ib +b4: MOV AH/R12L,Ib +b5: MOV CH/R13L,Ib +b6: MOV DH/R14L,Ib +b7: MOV BH/R15L,Ib +b8: MOV rAX/r8,Iv +b9: MOV rCX/r9,Iv +ba: MOV rDX/r10,Iv +bb: MOV rBX/r11,Iv +bc: MOV rSP/r12,Iv +bd: MOV rBP/r13,Iv +be: MOV rSI/r14,Iv +bf: MOV rDI/r15,Iv +# 0xc0 - 0xcf +c0: Grp2 Eb,Ib (1A) +c1: Grp2 Ev,Ib (1A) +c2: RETN Iw (f64) +c3: RETN +c4: LES Gz,Mp (i64) | VEX+2byte (Prefix) +c5: LDS Gz,Mp (i64) | VEX+1byte (Prefix) +c6: Grp11A Eb,Ib (1A) +c7: Grp11B Ev,Iz (1A) +c8: ENTER Iw,Ib +c9: LEAVE (d64) +ca: RETF Iw +cb: RETF +cc: INT3 +cd: INT Ib +ce: INTO (i64) +cf: IRET/D/Q +# 0xd0 - 0xdf +d0: Grp2 Eb,1 (1A) +d1: Grp2 Ev,1 (1A) +d2: Grp2 Eb,CL (1A) +d3: Grp2 Ev,CL (1A) +d4: AAM Ib (i64) +d5: AAD Ib (i64) +d6: +d7: XLAT/XLATB +d8: ESC +d9: ESC +da: ESC +db: ESC +dc: ESC +dd: ESC +de: ESC +df: ESC +# 0xe0 - 0xef +# Note: "forced64" is Intel CPU behavior: they ignore 0x66 prefix +# in 64-bit mode. AMD CPUs accept 0x66 prefix, it causes RIP truncation +# to 16 bits. In 32-bit mode, 0x66 is accepted by both Intel and AMD. +e0: LOOPNE/LOOPNZ Jb (f64) +e1: LOOPE/LOOPZ Jb (f64) +e2: LOOP Jb (f64) +e3: JrCXZ Jb (f64) +e4: IN AL,Ib +e5: IN eAX,Ib +e6: OUT Ib,AL +e7: OUT Ib,eAX +# With 0x66 prefix in 64-bit mode, for AMD CPUs immediate offset +# in "near" jumps and calls is 16-bit. For CALL, +# push of return address is 16-bit wide, RSP is decremented by 2 +# but is not truncated to 16 bits, unlike RIP. +e8: CALL Jz (f64) +e9: JMP-near Jz (f64) +ea: JMP-far Ap (i64) +eb: JMP-short Jb (f64) +ec: IN AL,DX +ed: IN eAX,DX +ee: OUT DX,AL +ef: OUT DX,eAX +# 0xf0 - 0xff +f0: LOCK (Prefix) +f1: +f2: REPNE (Prefix) | XACQUIRE (Prefix) +f3: REP/REPE (Prefix) | XRELEASE (Prefix) +f4: HLT +f5: CMC +f6: Grp3_1 Eb (1A) +f7: Grp3_2 Ev (1A) +f8: CLC +f9: STC +fa: CLI +fb: STI +fc: CLD +fd: STD +fe: Grp4 (1A) +ff: Grp5 (1A) +EndTable + +Table: 2-byte opcode (0x0f) +Referrer: 2-byte escape +AVXcode: 1 +# 0x0f 0x00-0x0f +00: Grp6 (1A) +01: Grp7 (1A) +02: LAR Gv,Ew +03: LSL Gv,Ew +04: +05: SYSCALL (o64) +06: CLTS +07: SYSRET (o64) +08: INVD +09: WBINVD +0a: +0b: UD2 (1B) +0c: +# AMD's prefetch group. Intel supports prefetchw(/1) only. +0d: GrpP +0e: FEMMS +# 3DNow! uses the last imm byte as opcode extension. +0f: 3DNow! Pq,Qq,Ib +# 0x0f 0x10-0x1f +# NOTE: According to Intel SDM opcode map, vmovups and vmovupd has no operands +# but it actually has operands. And also, vmovss and vmovsd only accept 128bit. +# MOVSS/MOVSD has too many forms(3) on SDM. This map just shows a typical form. +# Many AVX instructions lack v1 superscript, according to Intel AVX-Prgramming +# Reference A.1 +10: vmovups Vps,Wps | vmovupd Vpd,Wpd (66) | vmovss Vx,Hx,Wss (F3),(v1) | vmovsd Vx,Hx,Wsd (F2),(v1) +11: vmovups Wps,Vps | vmovupd Wpd,Vpd (66) | vmovss Wss,Hx,Vss (F3),(v1) | vmovsd Wsd,Hx,Vsd (F2),(v1) +12: vmovlps Vq,Hq,Mq (v1) | vmovhlps Vq,Hq,Uq (v1) | vmovlpd Vq,Hq,Mq (66),(v1) | vmovsldup Vx,Wx (F3) | vmovddup Vx,Wx (F2) +13: vmovlps Mq,Vq (v1) | vmovlpd Mq,Vq (66),(v1) +14: vunpcklps Vx,Hx,Wx | vunpcklpd Vx,Hx,Wx (66) +15: vunpckhps Vx,Hx,Wx | vunpckhpd Vx,Hx,Wx (66) +16: vmovhps Vdq,Hq,Mq (v1) | vmovlhps Vdq,Hq,Uq (v1) | vmovhpd Vdq,Hq,Mq (66),(v1) | vmovshdup Vx,Wx (F3) +17: vmovhps Mq,Vq (v1) | vmovhpd Mq,Vq (66),(v1) +18: Grp16 (1A) +19: +# Intel SDM opcode map does not list MPX instructions. For now using Gv for +# bnd registers and Ev for everything else is OK because the instruction +# decoder does not use the information except as an indication that there is +# a ModR/M byte. +1a: BNDCL Gv,Ev (F3) | BNDCU Gv,Ev (F2) | BNDMOV Gv,Ev (66) | BNDLDX Gv,Ev +1b: BNDCN Gv,Ev (F2) | BNDMOV Ev,Gv (66) | BNDMK Gv,Ev (F3) | BNDSTX Ev,Gv +1c: +1d: +1e: +1f: NOP Ev +# 0x0f 0x20-0x2f +20: MOV Rd,Cd +21: MOV Rd,Dd +22: MOV Cd,Rd +23: MOV Dd,Rd +24: +25: +26: +27: +28: vmovaps Vps,Wps | vmovapd Vpd,Wpd (66) +29: vmovaps Wps,Vps | vmovapd Wpd,Vpd (66) +2a: cvtpi2ps Vps,Qpi | cvtpi2pd Vpd,Qpi (66) | vcvtsi2ss Vss,Hss,Ey (F3),(v1) | vcvtsi2sd Vsd,Hsd,Ey (F2),(v1) +2b: vmovntps Mps,Vps | vmovntpd Mpd,Vpd (66) +2c: cvttps2pi Ppi,Wps | cvttpd2pi Ppi,Wpd (66) | vcvttss2si Gy,Wss (F3),(v1) | vcvttsd2si Gy,Wsd (F2),(v1) +2d: cvtps2pi Ppi,Wps | cvtpd2pi Qpi,Wpd (66) | vcvtss2si Gy,Wss (F3),(v1) | vcvtsd2si Gy,Wsd (F2),(v1) +2e: vucomiss Vss,Wss (v1) | vucomisd Vsd,Wsd (66),(v1) +2f: vcomiss Vss,Wss (v1) | vcomisd Vsd,Wsd (66),(v1) +# 0x0f 0x30-0x3f +30: WRMSR +31: RDTSC +32: RDMSR +33: RDPMC +34: SYSENTER +35: SYSEXIT +36: +37: GETSEC +38: escape # 3-byte escape 1 +39: +3a: escape # 3-byte escape 2 +3b: +3c: +3d: +3e: +3f: +# 0x0f 0x40-0x4f +40: CMOVO Gv,Ev +41: CMOVNO Gv,Ev | kandw/q Vk,Hk,Uk | kandb/d Vk,Hk,Uk (66) +42: CMOVB/C/NAE Gv,Ev | kandnw/q Vk,Hk,Uk | kandnb/d Vk,Hk,Uk (66) +43: CMOVAE/NB/NC Gv,Ev +44: CMOVE/Z Gv,Ev | knotw/q Vk,Uk | knotb/d Vk,Uk (66) +45: CMOVNE/NZ Gv,Ev | korw/q Vk,Hk,Uk | korb/d Vk,Hk,Uk (66) +46: CMOVBE/NA Gv,Ev | kxnorw/q Vk,Hk,Uk | kxnorb/d Vk,Hk,Uk (66) +47: CMOVA/NBE Gv,Ev | kxorw/q Vk,Hk,Uk | kxorb/d Vk,Hk,Uk (66) +48: CMOVS Gv,Ev +49: CMOVNS Gv,Ev +4a: CMOVP/PE Gv,Ev | kaddw/q Vk,Hk,Uk | kaddb/d Vk,Hk,Uk (66) +4b: CMOVNP/PO Gv,Ev | kunpckbw Vk,Hk,Uk (66) | kunpckwd/dq Vk,Hk,Uk +4c: CMOVL/NGE Gv,Ev +4d: CMOVNL/GE Gv,Ev +4e: CMOVLE/NG Gv,Ev +4f: CMOVNLE/G Gv,Ev +# 0x0f 0x50-0x5f +50: vmovmskps Gy,Ups | vmovmskpd Gy,Upd (66) +51: vsqrtps Vps,Wps | vsqrtpd Vpd,Wpd (66) | vsqrtss Vss,Hss,Wss (F3),(v1) | vsqrtsd Vsd,Hsd,Wsd (F2),(v1) +52: vrsqrtps Vps,Wps | vrsqrtss Vss,Hss,Wss (F3),(v1) +53: vrcpps Vps,Wps | vrcpss Vss,Hss,Wss (F3),(v1) +54: vandps Vps,Hps,Wps | vandpd Vpd,Hpd,Wpd (66) +55: vandnps Vps,Hps,Wps | vandnpd Vpd,Hpd,Wpd (66) +56: vorps Vps,Hps,Wps | vorpd Vpd,Hpd,Wpd (66) +57: vxorps Vps,Hps,Wps | vxorpd Vpd,Hpd,Wpd (66) +58: vaddps Vps,Hps,Wps | vaddpd Vpd,Hpd,Wpd (66) | vaddss Vss,Hss,Wss (F3),(v1) | vaddsd Vsd,Hsd,Wsd (F2),(v1) +59: vmulps Vps,Hps,Wps | vmulpd Vpd,Hpd,Wpd (66) | vmulss Vss,Hss,Wss (F3),(v1) | vmulsd Vsd,Hsd,Wsd (F2),(v1) +5a: vcvtps2pd Vpd,Wps | vcvtpd2ps Vps,Wpd (66) | vcvtss2sd Vsd,Hx,Wss (F3),(v1) | vcvtsd2ss Vss,Hx,Wsd (F2),(v1) +5b: vcvtdq2ps Vps,Wdq | vcvtqq2ps Vps,Wqq (evo) | vcvtps2dq Vdq,Wps (66) | vcvttps2dq Vdq,Wps (F3) +5c: vsubps Vps,Hps,Wps | vsubpd Vpd,Hpd,Wpd (66) | vsubss Vss,Hss,Wss (F3),(v1) | vsubsd Vsd,Hsd,Wsd (F2),(v1) +5d: vminps Vps,Hps,Wps | vminpd Vpd,Hpd,Wpd (66) | vminss Vss,Hss,Wss (F3),(v1) | vminsd Vsd,Hsd,Wsd (F2),(v1) +5e: vdivps Vps,Hps,Wps | vdivpd Vpd,Hpd,Wpd (66) | vdivss Vss,Hss,Wss (F3),(v1) | vdivsd Vsd,Hsd,Wsd (F2),(v1) +5f: vmaxps Vps,Hps,Wps | vmaxpd Vpd,Hpd,Wpd (66) | vmaxss Vss,Hss,Wss (F3),(v1) | vmaxsd Vsd,Hsd,Wsd (F2),(v1) +# 0x0f 0x60-0x6f +60: punpcklbw Pq,Qd | vpunpcklbw Vx,Hx,Wx (66),(v1) +61: punpcklwd Pq,Qd | vpunpcklwd Vx,Hx,Wx (66),(v1) +62: punpckldq Pq,Qd | vpunpckldq Vx,Hx,Wx (66),(v1) +63: packsswb Pq,Qq | vpacksswb Vx,Hx,Wx (66),(v1) +64: pcmpgtb Pq,Qq | vpcmpgtb Vx,Hx,Wx (66),(v1) +65: pcmpgtw Pq,Qq | vpcmpgtw Vx,Hx,Wx (66),(v1) +66: pcmpgtd Pq,Qq | vpcmpgtd Vx,Hx,Wx (66),(v1) +67: packuswb Pq,Qq | vpackuswb Vx,Hx,Wx (66),(v1) +68: punpckhbw Pq,Qd | vpunpckhbw Vx,Hx,Wx (66),(v1) +69: punpckhwd Pq,Qd | vpunpckhwd Vx,Hx,Wx (66),(v1) +6a: punpckhdq Pq,Qd | vpunpckhdq Vx,Hx,Wx (66),(v1) +6b: packssdw Pq,Qd | vpackssdw Vx,Hx,Wx (66),(v1) +6c: vpunpcklqdq Vx,Hx,Wx (66),(v1) +6d: vpunpckhqdq Vx,Hx,Wx (66),(v1) +6e: movd/q Pd,Ey | vmovd/q Vy,Ey (66),(v1) +6f: movq Pq,Qq | vmovdqa Vx,Wx (66) | vmovdqa32/64 Vx,Wx (66),(evo) | vmovdqu Vx,Wx (F3) | vmovdqu32/64 Vx,Wx (F3),(evo) | vmovdqu8/16 Vx,Wx (F2),(ev) +# 0x0f 0x70-0x7f +70: pshufw Pq,Qq,Ib | vpshufd Vx,Wx,Ib (66),(v1) | vpshufhw Vx,Wx,Ib (F3),(v1) | vpshuflw Vx,Wx,Ib (F2),(v1) +71: Grp12 (1A) +72: Grp13 (1A) +73: Grp14 (1A) +74: pcmpeqb Pq,Qq | vpcmpeqb Vx,Hx,Wx (66),(v1) +75: pcmpeqw Pq,Qq | vpcmpeqw Vx,Hx,Wx (66),(v1) +76: pcmpeqd Pq,Qq | vpcmpeqd Vx,Hx,Wx (66),(v1) +# Note: Remove (v), because vzeroall and vzeroupper becomes emms without VEX. +77: emms | vzeroupper | vzeroall +78: VMREAD Ey,Gy | vcvttps2udq/pd2udq Vx,Wpd (evo) | vcvttsd2usi Gv,Wx (F2),(ev) | vcvttss2usi Gv,Wx (F3),(ev) | vcvttps2uqq/pd2uqq Vx,Wx (66),(ev) +79: VMWRITE Gy,Ey | vcvtps2udq/pd2udq Vx,Wpd (evo) | vcvtsd2usi Gv,Wx (F2),(ev) | vcvtss2usi Gv,Wx (F3),(ev) | vcvtps2uqq/pd2uqq Vx,Wx (66),(ev) +7a: vcvtudq2pd/uqq2pd Vpd,Wx (F3),(ev) | vcvtudq2ps/uqq2ps Vpd,Wx (F2),(ev) | vcvttps2qq/pd2qq Vx,Wx (66),(ev) +7b: vcvtusi2sd Vpd,Hpd,Ev (F2),(ev) | vcvtusi2ss Vps,Hps,Ev (F3),(ev) | vcvtps2qq/pd2qq Vx,Wx (66),(ev) +7c: vhaddpd Vpd,Hpd,Wpd (66) | vhaddps Vps,Hps,Wps (F2) +7d: vhsubpd Vpd,Hpd,Wpd (66) | vhsubps Vps,Hps,Wps (F2) +7e: movd/q Ey,Pd | vmovd/q Ey,Vy (66),(v1) | vmovq Vq,Wq (F3),(v1) +7f: movq Qq,Pq | vmovdqa Wx,Vx (66) | vmovdqa32/64 Wx,Vx (66),(evo) | vmovdqu Wx,Vx (F3) | vmovdqu32/64 Wx,Vx (F3),(evo) | vmovdqu8/16 Wx,Vx (F2),(ev) +# 0x0f 0x80-0x8f +# Note: "forced64" is Intel CPU behavior (see comment about CALL insn). +80: JO Jz (f64) +81: JNO Jz (f64) +82: JB/JC/JNAE Jz (f64) +83: JAE/JNB/JNC Jz (f64) +84: JE/JZ Jz (f64) +85: JNE/JNZ Jz (f64) +86: JBE/JNA Jz (f64) +87: JA/JNBE Jz (f64) +88: JS Jz (f64) +89: JNS Jz (f64) +8a: JP/JPE Jz (f64) +8b: JNP/JPO Jz (f64) +8c: JL/JNGE Jz (f64) +8d: JNL/JGE Jz (f64) +8e: JLE/JNG Jz (f64) +8f: JNLE/JG Jz (f64) +# 0x0f 0x90-0x9f +90: SETO Eb | kmovw/q Vk,Wk | kmovb/d Vk,Wk (66) +91: SETNO Eb | kmovw/q Mv,Vk | kmovb/d Mv,Vk (66) +92: SETB/C/NAE Eb | kmovw Vk,Rv | kmovb Vk,Rv (66) | kmovq/d Vk,Rv (F2) +93: SETAE/NB/NC Eb | kmovw Gv,Uk | kmovb Gv,Uk (66) | kmovq/d Gv,Uk (F2) +94: SETE/Z Eb +95: SETNE/NZ Eb +96: SETBE/NA Eb +97: SETA/NBE Eb +98: SETS Eb | kortestw/q Vk,Uk | kortestb/d Vk,Uk (66) +99: SETNS Eb | ktestw/q Vk,Uk | ktestb/d Vk,Uk (66) +9a: SETP/PE Eb +9b: SETNP/PO Eb +9c: SETL/NGE Eb +9d: SETNL/GE Eb +9e: SETLE/NG Eb +9f: SETNLE/G Eb +# 0x0f 0xa0-0xaf +a0: PUSH FS (d64) +a1: POP FS (d64) +a2: CPUID +a3: BT Ev,Gv +a4: SHLD Ev,Gv,Ib +a5: SHLD Ev,Gv,CL +a6: GrpPDLK +a7: GrpRNG +a8: PUSH GS (d64) +a9: POP GS (d64) +aa: RSM +ab: BTS Ev,Gv +ac: SHRD Ev,Gv,Ib +ad: SHRD Ev,Gv,CL +ae: Grp15 (1A),(1C) +af: IMUL Gv,Ev +# 0x0f 0xb0-0xbf +b0: CMPXCHG Eb,Gb +b1: CMPXCHG Ev,Gv +b2: LSS Gv,Mp +b3: BTR Ev,Gv +b4: LFS Gv,Mp +b5: LGS Gv,Mp +b6: MOVZX Gv,Eb +b7: MOVZX Gv,Ew +b8: JMPE (!F3) | POPCNT Gv,Ev (F3) +b9: Grp10 (1A) +ba: Grp8 Ev,Ib (1A) +bb: BTC Ev,Gv +bc: BSF Gv,Ev (!F3) | TZCNT Gv,Ev (F3) +bd: BSR Gv,Ev (!F3) | LZCNT Gv,Ev (F3) +be: MOVSX Gv,Eb +bf: MOVSX Gv,Ew +# 0x0f 0xc0-0xcf +c0: XADD Eb,Gb +c1: XADD Ev,Gv +c2: vcmpps Vps,Hps,Wps,Ib | vcmppd Vpd,Hpd,Wpd,Ib (66) | vcmpss Vss,Hss,Wss,Ib (F3),(v1) | vcmpsd Vsd,Hsd,Wsd,Ib (F2),(v1) +c3: movnti My,Gy +c4: pinsrw Pq,Ry/Mw,Ib | vpinsrw Vdq,Hdq,Ry/Mw,Ib (66),(v1) +c5: pextrw Gd,Nq,Ib | vpextrw Gd,Udq,Ib (66),(v1) +c6: vshufps Vps,Hps,Wps,Ib | vshufpd Vpd,Hpd,Wpd,Ib (66) +c7: Grp9 (1A) +c8: BSWAP RAX/EAX/R8/R8D +c9: BSWAP RCX/ECX/R9/R9D +ca: BSWAP RDX/EDX/R10/R10D +cb: BSWAP RBX/EBX/R11/R11D +cc: BSWAP RSP/ESP/R12/R12D +cd: BSWAP RBP/EBP/R13/R13D +ce: BSWAP RSI/ESI/R14/R14D +cf: BSWAP RDI/EDI/R15/R15D +# 0x0f 0xd0-0xdf +d0: vaddsubpd Vpd,Hpd,Wpd (66) | vaddsubps Vps,Hps,Wps (F2) +d1: psrlw Pq,Qq | vpsrlw Vx,Hx,Wx (66),(v1) +d2: psrld Pq,Qq | vpsrld Vx,Hx,Wx (66),(v1) +d3: psrlq Pq,Qq | vpsrlq Vx,Hx,Wx (66),(v1) +d4: paddq Pq,Qq | vpaddq Vx,Hx,Wx (66),(v1) +d5: pmullw Pq,Qq | vpmullw Vx,Hx,Wx (66),(v1) +d6: vmovq Wq,Vq (66),(v1) | movq2dq Vdq,Nq (F3) | movdq2q Pq,Uq (F2) +d7: pmovmskb Gd,Nq | vpmovmskb Gd,Ux (66),(v1) +d8: psubusb Pq,Qq | vpsubusb Vx,Hx,Wx (66),(v1) +d9: psubusw Pq,Qq | vpsubusw Vx,Hx,Wx (66),(v1) +da: pminub Pq,Qq | vpminub Vx,Hx,Wx (66),(v1) +db: pand Pq,Qq | vpand Vx,Hx,Wx (66),(v1) | vpandd/q Vx,Hx,Wx (66),(evo) +dc: paddusb Pq,Qq | vpaddusb Vx,Hx,Wx (66),(v1) +dd: paddusw Pq,Qq | vpaddusw Vx,Hx,Wx (66),(v1) +de: pmaxub Pq,Qq | vpmaxub Vx,Hx,Wx (66),(v1) +df: pandn Pq,Qq | vpandn Vx,Hx,Wx (66),(v1) | vpandnd/q Vx,Hx,Wx (66),(evo) +# 0x0f 0xe0-0xef +e0: pavgb Pq,Qq | vpavgb Vx,Hx,Wx (66),(v1) +e1: psraw Pq,Qq | vpsraw Vx,Hx,Wx (66),(v1) +e2: psrad Pq,Qq | vpsrad Vx,Hx,Wx (66),(v1) +e3: pavgw Pq,Qq | vpavgw Vx,Hx,Wx (66),(v1) +e4: pmulhuw Pq,Qq | vpmulhuw Vx,Hx,Wx (66),(v1) +e5: pmulhw Pq,Qq | vpmulhw Vx,Hx,Wx (66),(v1) +e6: vcvttpd2dq Vx,Wpd (66) | vcvtdq2pd Vx,Wdq (F3) | vcvtdq2pd/qq2pd Vx,Wdq (F3),(evo) | vcvtpd2dq Vx,Wpd (F2) +e7: movntq Mq,Pq | vmovntdq Mx,Vx (66) +e8: psubsb Pq,Qq | vpsubsb Vx,Hx,Wx (66),(v1) +e9: psubsw Pq,Qq | vpsubsw Vx,Hx,Wx (66),(v1) +ea: pminsw Pq,Qq | vpminsw Vx,Hx,Wx (66),(v1) +eb: por Pq,Qq | vpor Vx,Hx,Wx (66),(v1) | vpord/q Vx,Hx,Wx (66),(evo) +ec: paddsb Pq,Qq | vpaddsb Vx,Hx,Wx (66),(v1) +ed: paddsw Pq,Qq | vpaddsw Vx,Hx,Wx (66),(v1) +ee: pmaxsw Pq,Qq | vpmaxsw Vx,Hx,Wx (66),(v1) +ef: pxor Pq,Qq | vpxor Vx,Hx,Wx (66),(v1) | vpxord/q Vx,Hx,Wx (66),(evo) +# 0x0f 0xf0-0xff +f0: vlddqu Vx,Mx (F2) +f1: psllw Pq,Qq | vpsllw Vx,Hx,Wx (66),(v1) +f2: pslld Pq,Qq | vpslld Vx,Hx,Wx (66),(v1) +f3: psllq Pq,Qq | vpsllq Vx,Hx,Wx (66),(v1) +f4: pmuludq Pq,Qq | vpmuludq Vx,Hx,Wx (66),(v1) +f5: pmaddwd Pq,Qq | vpmaddwd Vx,Hx,Wx (66),(v1) +f6: psadbw Pq,Qq | vpsadbw Vx,Hx,Wx (66),(v1) +f7: maskmovq Pq,Nq | vmaskmovdqu Vx,Ux (66),(v1) +f8: psubb Pq,Qq | vpsubb Vx,Hx,Wx (66),(v1) +f9: psubw Pq,Qq | vpsubw Vx,Hx,Wx (66),(v1) +fa: psubd Pq,Qq | vpsubd Vx,Hx,Wx (66),(v1) +fb: psubq Pq,Qq | vpsubq Vx,Hx,Wx (66),(v1) +fc: paddb Pq,Qq | vpaddb Vx,Hx,Wx (66),(v1) +fd: paddw Pq,Qq | vpaddw Vx,Hx,Wx (66),(v1) +fe: paddd Pq,Qq | vpaddd Vx,Hx,Wx (66),(v1) +ff: UD0 +EndTable + +Table: 3-byte opcode 1 (0x0f 0x38) +Referrer: 3-byte escape 1 +AVXcode: 2 +# 0x0f 0x38 0x00-0x0f +00: pshufb Pq,Qq | vpshufb Vx,Hx,Wx (66),(v1) +01: phaddw Pq,Qq | vphaddw Vx,Hx,Wx (66),(v1) +02: phaddd Pq,Qq | vphaddd Vx,Hx,Wx (66),(v1) +03: phaddsw Pq,Qq | vphaddsw Vx,Hx,Wx (66),(v1) +04: pmaddubsw Pq,Qq | vpmaddubsw Vx,Hx,Wx (66),(v1) +05: phsubw Pq,Qq | vphsubw Vx,Hx,Wx (66),(v1) +06: phsubd Pq,Qq | vphsubd Vx,Hx,Wx (66),(v1) +07: phsubsw Pq,Qq | vphsubsw Vx,Hx,Wx (66),(v1) +08: psignb Pq,Qq | vpsignb Vx,Hx,Wx (66),(v1) +09: psignw Pq,Qq | vpsignw Vx,Hx,Wx (66),(v1) +0a: psignd Pq,Qq | vpsignd Vx,Hx,Wx (66),(v1) +0b: pmulhrsw Pq,Qq | vpmulhrsw Vx,Hx,Wx (66),(v1) +0c: vpermilps Vx,Hx,Wx (66),(v) +0d: vpermilpd Vx,Hx,Wx (66),(v) +0e: vtestps Vx,Wx (66),(v) +0f: vtestpd Vx,Wx (66),(v) +# 0x0f 0x38 0x10-0x1f +10: pblendvb Vdq,Wdq (66) | vpsrlvw Vx,Hx,Wx (66),(evo) | vpmovuswb Wx,Vx (F3),(ev) +11: vpmovusdb Wx,Vd (F3),(ev) | vpsravw Vx,Hx,Wx (66),(ev) +12: vpmovusqb Wx,Vq (F3),(ev) | vpsllvw Vx,Hx,Wx (66),(ev) +13: vcvtph2ps Vx,Wx (66),(v) | vpmovusdw Wx,Vd (F3),(ev) +14: blendvps Vdq,Wdq (66) | vpmovusqw Wx,Vq (F3),(ev) | vprorvd/q Vx,Hx,Wx (66),(evo) +15: blendvpd Vdq,Wdq (66) | vpmovusqd Wx,Vq (F3),(ev) | vprolvd/q Vx,Hx,Wx (66),(evo) +16: vpermps Vqq,Hqq,Wqq (66),(v) | vpermps/d Vqq,Hqq,Wqq (66),(evo) +17: vptest Vx,Wx (66) +18: vbroadcastss Vx,Wd (66),(v) +19: vbroadcastsd Vqq,Wq (66),(v) | vbroadcastf32x2 Vqq,Wq (66),(evo) +1a: vbroadcastf128 Vqq,Mdq (66),(v) | vbroadcastf32x4/64x2 Vqq,Wq (66),(evo) +1b: vbroadcastf32x8/64x4 Vqq,Mdq (66),(ev) +1c: pabsb Pq,Qq | vpabsb Vx,Wx (66),(v1) +1d: pabsw Pq,Qq | vpabsw Vx,Wx (66),(v1) +1e: pabsd Pq,Qq | vpabsd Vx,Wx (66),(v1) +1f: vpabsq Vx,Wx (66),(ev) +# 0x0f 0x38 0x20-0x2f +20: vpmovsxbw Vx,Ux/Mq (66),(v1) | vpmovswb Wx,Vx (F3),(ev) +21: vpmovsxbd Vx,Ux/Md (66),(v1) | vpmovsdb Wx,Vd (F3),(ev) +22: vpmovsxbq Vx,Ux/Mw (66),(v1) | vpmovsqb Wx,Vq (F3),(ev) +23: vpmovsxwd Vx,Ux/Mq (66),(v1) | vpmovsdw Wx,Vd (F3),(ev) +24: vpmovsxwq Vx,Ux/Md (66),(v1) | vpmovsqw Wx,Vq (F3),(ev) +25: vpmovsxdq Vx,Ux/Mq (66),(v1) | vpmovsqd Wx,Vq (F3),(ev) +26: vptestmb/w Vk,Hx,Wx (66),(ev) | vptestnmb/w Vk,Hx,Wx (F3),(ev) +27: vptestmd/q Vk,Hx,Wx (66),(ev) | vptestnmd/q Vk,Hx,Wx (F3),(ev) +28: vpmuldq Vx,Hx,Wx (66),(v1) | vpmovm2b/w Vx,Uk (F3),(ev) +29: vpcmpeqq Vx,Hx,Wx (66),(v1) | vpmovb2m/w2m Vk,Ux (F3),(ev) +2a: vmovntdqa Vx,Mx (66),(v1) | vpbroadcastmb2q Vx,Uk (F3),(ev) +2b: vpackusdw Vx,Hx,Wx (66),(v1) +2c: vmaskmovps Vx,Hx,Mx (66),(v) | vscalefps/d Vx,Hx,Wx (66),(evo) +2d: vmaskmovpd Vx,Hx,Mx (66),(v) | vscalefss/d Vx,Hx,Wx (66),(evo) +2e: vmaskmovps Mx,Hx,Vx (66),(v) +2f: vmaskmovpd Mx,Hx,Vx (66),(v) +# 0x0f 0x38 0x30-0x3f +30: vpmovzxbw Vx,Ux/Mq (66),(v1) | vpmovwb Wx,Vx (F3),(ev) +31: vpmovzxbd Vx,Ux/Md (66),(v1) | vpmovdb Wx,Vd (F3),(ev) +32: vpmovzxbq Vx,Ux/Mw (66),(v1) | vpmovqb Wx,Vq (F3),(ev) +33: vpmovzxwd Vx,Ux/Mq (66),(v1) | vpmovdw Wx,Vd (F3),(ev) +34: vpmovzxwq Vx,Ux/Md (66),(v1) | vpmovqw Wx,Vq (F3),(ev) +35: vpmovzxdq Vx,Ux/Mq (66),(v1) | vpmovqd Wx,Vq (F3),(ev) +36: vpermd Vqq,Hqq,Wqq (66),(v) | vpermd/q Vqq,Hqq,Wqq (66),(evo) +37: vpcmpgtq Vx,Hx,Wx (66),(v1) +38: vpminsb Vx,Hx,Wx (66),(v1) | vpmovm2d/q Vx,Uk (F3),(ev) +39: vpminsd Vx,Hx,Wx (66),(v1) | vpminsd/q Vx,Hx,Wx (66),(evo) | vpmovd2m/q2m Vk,Ux (F3),(ev) +3a: vpminuw Vx,Hx,Wx (66),(v1) | vpbroadcastmw2d Vx,Uk (F3),(ev) +3b: vpminud Vx,Hx,Wx (66),(v1) | vpminud/q Vx,Hx,Wx (66),(evo) +3c: vpmaxsb Vx,Hx,Wx (66),(v1) +3d: vpmaxsd Vx,Hx,Wx (66),(v1) | vpmaxsd/q Vx,Hx,Wx (66),(evo) +3e: vpmaxuw Vx,Hx,Wx (66),(v1) +3f: vpmaxud Vx,Hx,Wx (66),(v1) | vpmaxud/q Vx,Hx,Wx (66),(evo) +# 0x0f 0x38 0x40-0x8f +40: vpmulld Vx,Hx,Wx (66),(v1) | vpmulld/q Vx,Hx,Wx (66),(evo) +41: vphminposuw Vdq,Wdq (66),(v1) +42: vgetexpps/d Vx,Wx (66),(ev) +43: vgetexpss/d Vx,Hx,Wx (66),(ev) +44: vplzcntd/q Vx,Wx (66),(ev) +45: vpsrlvd/q Vx,Hx,Wx (66),(v) +46: vpsravd Vx,Hx,Wx (66),(v) | vpsravd/q Vx,Hx,Wx (66),(evo) +47: vpsllvd/q Vx,Hx,Wx (66),(v) +# Skip 0x48-0x4b +4c: vrcp14ps/d Vpd,Wpd (66),(ev) +4d: vrcp14ss/d Vsd,Hpd,Wsd (66),(ev) +4e: vrsqrt14ps/d Vpd,Wpd (66),(ev) +4f: vrsqrt14ss/d Vsd,Hsd,Wsd (66),(ev) +# Skip 0x50-0x57 +58: vpbroadcastd Vx,Wx (66),(v) +59: vpbroadcastq Vx,Wx (66),(v) | vbroadcasti32x2 Vx,Wx (66),(evo) +5a: vbroadcasti128 Vqq,Mdq (66),(v) | vbroadcasti32x4/64x2 Vx,Wx (66),(evo) +5b: vbroadcasti32x8/64x4 Vqq,Mdq (66),(ev) +# Skip 0x5c-0x63 +64: vpblendmd/q Vx,Hx,Wx (66),(ev) +65: vblendmps/d Vx,Hx,Wx (66),(ev) +66: vpblendmb/w Vx,Hx,Wx (66),(ev) +# Skip 0x67-0x74 +75: vpermi2b/w Vx,Hx,Wx (66),(ev) +76: vpermi2d/q Vx,Hx,Wx (66),(ev) +77: vpermi2ps/d Vx,Hx,Wx (66),(ev) +78: vpbroadcastb Vx,Wx (66),(v) +79: vpbroadcastw Vx,Wx (66),(v) +7a: vpbroadcastb Vx,Rv (66),(ev) +7b: vpbroadcastw Vx,Rv (66),(ev) +7c: vpbroadcastd/q Vx,Rv (66),(ev) +7d: vpermt2b/w Vx,Hx,Wx (66),(ev) +7e: vpermt2d/q Vx,Hx,Wx (66),(ev) +7f: vpermt2ps/d Vx,Hx,Wx (66),(ev) +80: INVEPT Gy,Mdq (66) +81: INVVPID Gy,Mdq (66) +82: INVPCID Gy,Mdq (66) +83: vpmultishiftqb Vx,Hx,Wx (66),(ev) +88: vexpandps/d Vpd,Wpd (66),(ev) +89: vpexpandd/q Vx,Wx (66),(ev) +8a: vcompressps/d Wx,Vx (66),(ev) +8b: vpcompressd/q Wx,Vx (66),(ev) +8c: vpmaskmovd/q Vx,Hx,Mx (66),(v) +8d: vpermb/w Vx,Hx,Wx (66),(ev) +8e: vpmaskmovd/q Mx,Vx,Hx (66),(v) +# 0x0f 0x38 0x90-0xbf (FMA) +90: vgatherdd/q Vx,Hx,Wx (66),(v) | vpgatherdd/q Vx,Wx (66),(evo) +91: vgatherqd/q Vx,Hx,Wx (66),(v) | vpgatherqd/q Vx,Wx (66),(evo) +92: vgatherdps/d Vx,Hx,Wx (66),(v) +93: vgatherqps/d Vx,Hx,Wx (66),(v) +94: +95: +96: vfmaddsub132ps/d Vx,Hx,Wx (66),(v) +97: vfmsubadd132ps/d Vx,Hx,Wx (66),(v) +98: vfmadd132ps/d Vx,Hx,Wx (66),(v) +99: vfmadd132ss/d Vx,Hx,Wx (66),(v),(v1) +9a: vfmsub132ps/d Vx,Hx,Wx (66),(v) +9b: vfmsub132ss/d Vx,Hx,Wx (66),(v),(v1) +9c: vfnmadd132ps/d Vx,Hx,Wx (66),(v) +9d: vfnmadd132ss/d Vx,Hx,Wx (66),(v),(v1) +9e: vfnmsub132ps/d Vx,Hx,Wx (66),(v) +9f: vfnmsub132ss/d Vx,Hx,Wx (66),(v),(v1) +a0: vpscatterdd/q Wx,Vx (66),(ev) +a1: vpscatterqd/q Wx,Vx (66),(ev) +a2: vscatterdps/d Wx,Vx (66),(ev) +a3: vscatterqps/d Wx,Vx (66),(ev) +a6: vfmaddsub213ps/d Vx,Hx,Wx (66),(v) +a7: vfmsubadd213ps/d Vx,Hx,Wx (66),(v) +a8: vfmadd213ps/d Vx,Hx,Wx (66),(v) +a9: vfmadd213ss/d Vx,Hx,Wx (66),(v),(v1) +aa: vfmsub213ps/d Vx,Hx,Wx (66),(v) +ab: vfmsub213ss/d Vx,Hx,Wx (66),(v),(v1) +ac: vfnmadd213ps/d Vx,Hx,Wx (66),(v) +ad: vfnmadd213ss/d Vx,Hx,Wx (66),(v),(v1) +ae: vfnmsub213ps/d Vx,Hx,Wx (66),(v) +af: vfnmsub213ss/d Vx,Hx,Wx (66),(v),(v1) +b4: vpmadd52luq Vx,Hx,Wx (66),(ev) +b5: vpmadd52huq Vx,Hx,Wx (66),(ev) +b6: vfmaddsub231ps/d Vx,Hx,Wx (66),(v) +b7: vfmsubadd231ps/d Vx,Hx,Wx (66),(v) +b8: vfmadd231ps/d Vx,Hx,Wx (66),(v) +b9: vfmadd231ss/d Vx,Hx,Wx (66),(v),(v1) +ba: vfmsub231ps/d Vx,Hx,Wx (66),(v) +bb: vfmsub231ss/d Vx,Hx,Wx (66),(v),(v1) +bc: vfnmadd231ps/d Vx,Hx,Wx (66),(v) +bd: vfnmadd231ss/d Vx,Hx,Wx (66),(v),(v1) +be: vfnmsub231ps/d Vx,Hx,Wx (66),(v) +bf: vfnmsub231ss/d Vx,Hx,Wx (66),(v),(v1) +# 0x0f 0x38 0xc0-0xff +c4: vpconflictd/q Vx,Wx (66),(ev) +c6: Grp18 (1A) +c7: Grp19 (1A) +c8: sha1nexte Vdq,Wdq | vexp2ps/d Vx,Wx (66),(ev) +c9: sha1msg1 Vdq,Wdq +ca: sha1msg2 Vdq,Wdq | vrcp28ps/d Vx,Wx (66),(ev) +cb: sha256rnds2 Vdq,Wdq | vrcp28ss/d Vx,Hx,Wx (66),(ev) +cc: sha256msg1 Vdq,Wdq | vrsqrt28ps/d Vx,Wx (66),(ev) +cd: sha256msg2 Vdq,Wdq | vrsqrt28ss/d Vx,Hx,Wx (66),(ev) +db: VAESIMC Vdq,Wdq (66),(v1) +dc: VAESENC Vdq,Hdq,Wdq (66),(v1) +dd: VAESENCLAST Vdq,Hdq,Wdq (66),(v1) +de: VAESDEC Vdq,Hdq,Wdq (66),(v1) +df: VAESDECLAST Vdq,Hdq,Wdq (66),(v1) +f0: MOVBE Gy,My | MOVBE Gw,Mw (66) | CRC32 Gd,Eb (F2) | CRC32 Gd,Eb (66&F2) +f1: MOVBE My,Gy | MOVBE Mw,Gw (66) | CRC32 Gd,Ey (F2) | CRC32 Gd,Ew (66&F2) +f2: ANDN Gy,By,Ey (v) +f3: Grp17 (1A) +f5: BZHI Gy,Ey,By (v) | PEXT Gy,By,Ey (F3),(v) | PDEP Gy,By,Ey (F2),(v) +f6: ADCX Gy,Ey (66) | ADOX Gy,Ey (F3) | MULX By,Gy,rDX,Ey (F2),(v) +f7: BEXTR Gy,Ey,By (v) | SHLX Gy,Ey,By (66),(v) | SARX Gy,Ey,By (F3),(v) | SHRX Gy,Ey,By (F2),(v) +EndTable + +Table: 3-byte opcode 2 (0x0f 0x3a) +Referrer: 3-byte escape 2 +AVXcode: 3 +# 0x0f 0x3a 0x00-0xff +00: vpermq Vqq,Wqq,Ib (66),(v) +01: vpermpd Vqq,Wqq,Ib (66),(v) +02: vpblendd Vx,Hx,Wx,Ib (66),(v) +03: valignd/q Vx,Hx,Wx,Ib (66),(ev) +04: vpermilps Vx,Wx,Ib (66),(v) +05: vpermilpd Vx,Wx,Ib (66),(v) +06: vperm2f128 Vqq,Hqq,Wqq,Ib (66),(v) +07: +08: vroundps Vx,Wx,Ib (66) | vrndscaleps Vx,Wx,Ib (66),(evo) +09: vroundpd Vx,Wx,Ib (66) | vrndscalepd Vx,Wx,Ib (66),(evo) +0a: vroundss Vss,Wss,Ib (66),(v1) | vrndscaless Vx,Hx,Wx,Ib (66),(evo) +0b: vroundsd Vsd,Wsd,Ib (66),(v1) | vrndscalesd Vx,Hx,Wx,Ib (66),(evo) +0c: vblendps Vx,Hx,Wx,Ib (66) +0d: vblendpd Vx,Hx,Wx,Ib (66) +0e: vpblendw Vx,Hx,Wx,Ib (66),(v1) +0f: palignr Pq,Qq,Ib | vpalignr Vx,Hx,Wx,Ib (66),(v1) +14: vpextrb Rd/Mb,Vdq,Ib (66),(v1) +15: vpextrw Rd/Mw,Vdq,Ib (66),(v1) +16: vpextrd/q Ey,Vdq,Ib (66),(v1) +17: vextractps Ed,Vdq,Ib (66),(v1) +18: vinsertf128 Vqq,Hqq,Wqq,Ib (66),(v) | vinsertf32x4/64x2 Vqq,Hqq,Wqq,Ib (66),(evo) +19: vextractf128 Wdq,Vqq,Ib (66),(v) | vextractf32x4/64x2 Wdq,Vqq,Ib (66),(evo) +1a: vinsertf32x8/64x4 Vqq,Hqq,Wqq,Ib (66),(ev) +1b: vextractf32x8/64x4 Wdq,Vqq,Ib (66),(ev) +1d: vcvtps2ph Wx,Vx,Ib (66),(v) +1e: vpcmpud/q Vk,Hd,Wd,Ib (66),(ev) +1f: vpcmpd/q Vk,Hd,Wd,Ib (66),(ev) +20: vpinsrb Vdq,Hdq,Ry/Mb,Ib (66),(v1) +21: vinsertps Vdq,Hdq,Udq/Md,Ib (66),(v1) +22: vpinsrd/q Vdq,Hdq,Ey,Ib (66),(v1) +23: vshuff32x4/64x2 Vx,Hx,Wx,Ib (66),(ev) +25: vpternlogd/q Vx,Hx,Wx,Ib (66),(ev) +26: vgetmantps/d Vx,Wx,Ib (66),(ev) +27: vgetmantss/d Vx,Hx,Wx,Ib (66),(ev) +30: kshiftrb/w Vk,Uk,Ib (66),(v) +31: kshiftrd/q Vk,Uk,Ib (66),(v) +32: kshiftlb/w Vk,Uk,Ib (66),(v) +33: kshiftld/q Vk,Uk,Ib (66),(v) +38: vinserti128 Vqq,Hqq,Wqq,Ib (66),(v) | vinserti32x4/64x2 Vqq,Hqq,Wqq,Ib (66),(evo) +39: vextracti128 Wdq,Vqq,Ib (66),(v) | vextracti32x4/64x2 Wdq,Vqq,Ib (66),(evo) +3a: vinserti32x8/64x4 Vqq,Hqq,Wqq,Ib (66),(ev) +3b: vextracti32x8/64x4 Wdq,Vqq,Ib (66),(ev) +3e: vpcmpub/w Vk,Hk,Wx,Ib (66),(ev) +3f: vpcmpb/w Vk,Hk,Wx,Ib (66),(ev) +40: vdpps Vx,Hx,Wx,Ib (66) +41: vdppd Vdq,Hdq,Wdq,Ib (66),(v1) +42: vmpsadbw Vx,Hx,Wx,Ib (66),(v1) | vdbpsadbw Vx,Hx,Wx,Ib (66),(evo) +43: vshufi32x4/64x2 Vx,Hx,Wx,Ib (66),(ev) +44: vpclmulqdq Vdq,Hdq,Wdq,Ib (66),(v1) +46: vperm2i128 Vqq,Hqq,Wqq,Ib (66),(v) +4a: vblendvps Vx,Hx,Wx,Lx (66),(v) +4b: vblendvpd Vx,Hx,Wx,Lx (66),(v) +4c: vpblendvb Vx,Hx,Wx,Lx (66),(v1) +50: vrangeps/d Vx,Hx,Wx,Ib (66),(ev) +51: vrangess/d Vx,Hx,Wx,Ib (66),(ev) +54: vfixupimmps/d Vx,Hx,Wx,Ib (66),(ev) +55: vfixupimmss/d Vx,Hx,Wx,Ib (66),(ev) +56: vreduceps/d Vx,Wx,Ib (66),(ev) +57: vreducess/d Vx,Hx,Wx,Ib (66),(ev) +60: vpcmpestrm Vdq,Wdq,Ib (66),(v1) +61: vpcmpestri Vdq,Wdq,Ib (66),(v1) +62: vpcmpistrm Vdq,Wdq,Ib (66),(v1) +63: vpcmpistri Vdq,Wdq,Ib (66),(v1) +66: vfpclassps/d Vk,Wx,Ib (66),(ev) +67: vfpclassss/d Vk,Wx,Ib (66),(ev) +cc: sha1rnds4 Vdq,Wdq,Ib +df: VAESKEYGEN Vdq,Wdq,Ib (66),(v1) +f0: RORX Gy,Ey,Ib (F2),(v) +EndTable + +GrpTable: Grp1 +0: ADD +1: OR +2: ADC +3: SBB +4: AND +5: SUB +6: XOR +7: CMP +EndTable + +GrpTable: Grp1A +0: POP +EndTable + +GrpTable: Grp2 +0: ROL +1: ROR +2: RCL +3: RCR +4: SHL/SAL +5: SHR +6: +7: SAR +EndTable + +GrpTable: Grp3_1 +0: TEST Eb,Ib +1: TEST Eb,Ib +2: NOT Eb +3: NEG Eb +4: MUL AL,Eb +5: IMUL AL,Eb +6: DIV AL,Eb +7: IDIV AL,Eb +EndTable + +GrpTable: Grp3_2 +0: TEST Ev,Iz +1: +2: NOT Ev +3: NEG Ev +4: MUL rAX,Ev +5: IMUL rAX,Ev +6: DIV rAX,Ev +7: IDIV rAX,Ev +EndTable + +GrpTable: Grp4 +0: INC Eb +1: DEC Eb +EndTable + +GrpTable: Grp5 +0: INC Ev +1: DEC Ev +# Note: "forced64" is Intel CPU behavior (see comment about CALL insn). +2: CALLN Ev (f64) +3: CALLF Ep +4: JMPN Ev (f64) +5: JMPF Mp +6: PUSH Ev (d64) +7: +EndTable + +GrpTable: Grp6 +0: SLDT Rv/Mw +1: STR Rv/Mw +2: LLDT Ew +3: LTR Ew +4: VERR Ew +5: VERW Ew +EndTable + +GrpTable: Grp7 +0: SGDT Ms | VMCALL (001),(11B) | VMLAUNCH (010),(11B) | VMRESUME (011),(11B) | VMXOFF (100),(11B) +1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) +2: LGDT Ms | XGETBV (000),(11B) | XSETBV (001),(11B) | VMFUNC (100),(11B) | XEND (101)(11B) | XTEST (110)(11B) +3: LIDT Ms +4: SMSW Mw/Rv +5: rdpkru (110),(11B) | wrpkru (111),(11B) +6: LMSW Ew +7: INVLPG Mb | SWAPGS (o64),(000),(11B) | RDTSCP (001),(11B) +EndTable + +GrpTable: Grp8 +4: BT +5: BTS +6: BTR +7: BTC +EndTable + +GrpTable: Grp9 +1: CMPXCHG8B/16B Mq/Mdq +3: xrstors +4: xsavec +5: xsaves +6: VMPTRLD Mq | VMCLEAR Mq (66) | VMXON Mq (F3) | RDRAND Rv (11B) +7: VMPTRST Mq | VMPTRST Mq (F3) | RDSEED Rv (11B) +EndTable + +GrpTable: Grp10 +# all are UD1 +0: UD1 +1: UD1 +2: UD1 +3: UD1 +4: UD1 +5: UD1 +6: UD1 +7: UD1 +EndTable + +# Grp11A and Grp11B are expressed as Grp11 in Intel SDM +GrpTable: Grp11A +0: MOV Eb,Ib +7: XABORT Ib (000),(11B) +EndTable + +GrpTable: Grp11B +0: MOV Eb,Iz +7: XBEGIN Jz (000),(11B) +EndTable + +GrpTable: Grp12 +2: psrlw Nq,Ib (11B) | vpsrlw Hx,Ux,Ib (66),(11B),(v1) +4: psraw Nq,Ib (11B) | vpsraw Hx,Ux,Ib (66),(11B),(v1) +6: psllw Nq,Ib (11B) | vpsllw Hx,Ux,Ib (66),(11B),(v1) +EndTable + +GrpTable: Grp13 +0: vprord/q Hx,Wx,Ib (66),(ev) +1: vprold/q Hx,Wx,Ib (66),(ev) +2: psrld Nq,Ib (11B) | vpsrld Hx,Ux,Ib (66),(11B),(v1) +4: psrad Nq,Ib (11B) | vpsrad Hx,Ux,Ib (66),(11B),(v1) | vpsrad/q Hx,Ux,Ib (66),(evo) +6: pslld Nq,Ib (11B) | vpslld Hx,Ux,Ib (66),(11B),(v1) +EndTable + +GrpTable: Grp14 +2: psrlq Nq,Ib (11B) | vpsrlq Hx,Ux,Ib (66),(11B),(v1) +3: vpsrldq Hx,Ux,Ib (66),(11B),(v1) +6: psllq Nq,Ib (11B) | vpsllq Hx,Ux,Ib (66),(11B),(v1) +7: vpslldq Hx,Ux,Ib (66),(11B),(v1) +EndTable + +GrpTable: Grp15 +0: fxsave | RDFSBASE Ry (F3),(11B) +1: fxstor | RDGSBASE Ry (F3),(11B) +2: vldmxcsr Md (v1) | WRFSBASE Ry (F3),(11B) +3: vstmxcsr Md (v1) | WRGSBASE Ry (F3),(11B) +4: XSAVE | ptwrite Ey (F3),(11B) +5: XRSTOR | lfence (11B) +6: XSAVEOPT | clwb (66) | mfence (11B) +7: clflush | clflushopt (66) | sfence (11B) +EndTable + +GrpTable: Grp16 +0: prefetch NTA +1: prefetch T0 +2: prefetch T1 +3: prefetch T2 +EndTable + +GrpTable: Grp17 +1: BLSR By,Ey (v) +2: BLSMSK By,Ey (v) +3: BLSI By,Ey (v) +EndTable + +GrpTable: Grp18 +1: vgatherpf0dps/d Wx (66),(ev) +2: vgatherpf1dps/d Wx (66),(ev) +5: vscatterpf0dps/d Wx (66),(ev) +6: vscatterpf1dps/d Wx (66),(ev) +EndTable + +GrpTable: Grp19 +1: vgatherpf0qps/d Wx (66),(ev) +2: vgatherpf1qps/d Wx (66),(ev) +5: vscatterpf0qps/d Wx (66),(ev) +6: vscatterpf1qps/d Wx (66),(ev) +EndTable + +# AMD's Prefetch Group +GrpTable: GrpP +0: PREFETCH +1: PREFETCHW +EndTable + +GrpTable: GrpPDLK +0: MONTMUL +1: XSHA1 +2: XSHA2 +EndTable + +GrpTable: GrpRNG +0: xstore-rng +1: xcrypt-ecb +2: xcrypt-cbc +4: xcrypt-cfb +5: xcrypt-ofb +EndTable diff --git a/tools/objtool/arch/x86/tools/gen-insn-attr-x86.awk b/tools/objtool/arch/x86/tools/gen-insn-attr-x86.awk new file mode 100644 index 000000000000..b02a36b2c14f --- /dev/null +++ b/tools/objtool/arch/x86/tools/gen-insn-attr-x86.awk @@ -0,0 +1,393 @@ +#!/bin/awk -f +# SPDX-License-Identifier: GPL-2.0 +# gen-insn-attr-x86.awk: Instruction attribute table generator +# Written by Masami Hiramatsu <mhiramat(a)redhat.com> +# +# Usage: awk -f gen-insn-attr-x86.awk x86-opcode-map.txt > inat-tables.c + +# Awk implementation sanity check +function check_awk_implement() { + if (sprintf("%x", 0) != "0") + return "Your awk has a printf-format problem." + return "" +} + +# Clear working vars +function clear_vars() { + delete table + delete lptable2 + delete lptable1 + delete lptable3 + eid = -1 # escape id + gid = -1 # group id + aid = -1 # AVX id + tname = "" +} + +BEGIN { + # Implementation error checking + awkchecked = check_awk_implement() + if (awkchecked != "") { + print "Error: " awkchecked > "/dev/stderr" + print "Please try to use gawk." > "/dev/stderr" + exit 1 + } + + # Setup generating tables + print "/* x86 opcode map generated from x86-opcode-map.txt */" + print "/* Do not change this code. */\n" + ggid = 1 + geid = 1 + gaid = 0 + delete etable + delete gtable + delete atable + + opnd_expr = "^[A-Za-z/]" + ext_expr = "^\$" + sep_expr = "^\\|$" + group_expr = "^Grp[0-9A-Za-z]+" + + imm_expr = "^[IJAOL][a-z]" + imm_flag["Ib"] = "INAT_MAKE_IMM(INAT_IMM_BYTE)" + imm_flag["Jb"] = "INAT_MAKE_IMM(INAT_IMM_BYTE)" + imm_flag["Iw"] = "INAT_MAKE_IMM(INAT_IMM_WORD)" + imm_flag["Id"] = "INAT_MAKE_IMM(INAT_IMM_DWORD)" + imm_flag["Iq"] = "INAT_MAKE_IMM(INAT_IMM_QWORD)" + imm_flag["Ap"] = "INAT_MAKE_IMM(INAT_IMM_PTR)" + imm_flag["Iz"] = "INAT_MAKE_IMM(INAT_IMM_VWORD32)" + imm_flag["Jz"] = "INAT_MAKE_IMM(INAT_IMM_VWORD32)" + imm_flag["Iv"] = "INAT_MAKE_IMM(INAT_IMM_VWORD)" + imm_flag["Ob"] = "INAT_MOFFSET" + imm_flag["Ov"] = "INAT_MOFFSET" + imm_flag["Lx"] = "INAT_MAKE_IMM(INAT_IMM_BYTE)" + + modrm_expr = "^([CDEGMNPQRSUVW/][a-z]+|NTA|T[012])" + force64_expr = "\\([df]64\$" + rex_expr = "^REX(\\.[XRWB]+)*" + fpu_expr = "^ESC" # TODO + + lprefix1_expr = "\$(66|!F3)\$" + lprefix2_expr = "\$F3\$" + lprefix3_expr = "\$(F2|!F3|66\\&F2)\$" + lprefix_expr = "\$(66|F2|F3)\$" + max_lprefix = 4 + + # All opcodes starting with lower-case 'v', 'k' or with (v1) superscript + # accepts VEX prefix + vexok_opcode_expr = "^[vk].*" + vexok_expr = "\$v1\$" + # All opcodes with (v) superscript supports *only* VEX prefix + vexonly_expr = "\$v\$" + # All opcodes with (ev) superscript supports *only* EVEX prefix + evexonly_expr = "\$ev\$" + + prefix_expr = "\$Prefix\$" + prefix_num["Operand-Size"] = "INAT_PFX_OPNDSZ" + prefix_num["REPNE"] = "INAT_PFX_REPNE" + prefix_num["REP/REPE"] = "INAT_PFX_REPE" + prefix_num["XACQUIRE"] = "INAT_PFX_REPNE" + prefix_num["XRELEASE"] = "INAT_PFX_REPE" + prefix_num["LOCK"] = "INAT_PFX_LOCK" + prefix_num["SEG=CS"] = "INAT_PFX_CS" + prefix_num["SEG=DS"] = "INAT_PFX_DS" + prefix_num["SEG=ES"] = "INAT_PFX_ES" + prefix_num["SEG=FS"] = "INAT_PFX_FS" + prefix_num["SEG=GS"] = "INAT_PFX_GS" + prefix_num["SEG=SS"] = "INAT_PFX_SS" + prefix_num["Address-Size"] = "INAT_PFX_ADDRSZ" + prefix_num["VEX+1byte"] = "INAT_PFX_VEX2" + prefix_num["VEX+2byte"] = "INAT_PFX_VEX3" + prefix_num["EVEX"] = "INAT_PFX_EVEX" + + clear_vars() +} + +function semantic_error(msg) { + print "Semantic error at " NR ": " msg > "/dev/stderr" + exit 1 +} + +function debug(msg) { + print "DEBUG: " msg +} + +function array_size(arr, i,c) { + c = 0 + for (i in arr) + c++ + return c +} + +/^Table:/ { + print "/* " $0 " */" + if (tname != "") + semantic_error("Hit Table: before EndTable:."); +} + +/^Referrer:/ { + if (NF != 1) { + # escape opcode table + ref = "" + for (i = 2; i <= NF; i++) + ref = ref $i + eid = escape[ref] + tname = sprintf("inat_escape_table_%d", eid) + } +} + +/^AVXcode:/ { + if (NF != 1) { + # AVX/escape opcode table + aid = $2 + if (gaid <= aid) + gaid = aid + 1 + if (tname == "") # AVX only opcode table + tname = sprintf("inat_avx_table_%d", $2) + } + if (aid == -1 && eid == -1) # primary opcode table + tname = "inat_primary_table" +} + +/^GrpTable:/ { + print "/* " $0 " */" + if (!($2 in group)) + semantic_error("No group: " $2 ) + gid = group[$2] + tname = "inat_group_table_" gid +} + +function print_table(tbl,name,fmt,n) +{ + print "const insn_attr_t " name " = {" + for (i = 0; i < n; i++) { + id = sprintf(fmt, i) + if (tbl[id]) + print " [" id "] = " tbl[id] "," + } + print "};" +} + +/^EndTable/ { + if (gid != -1) { + # print group tables + if (array_size(table) != 0) { + print_table(table, tname "[INAT_GROUP_TABLE_SIZE]", + "0x%x", 8) + gtable[gid,0] = tname + } + if (array_size(lptable1) != 0) { + print_table(lptable1, tname "_1[INAT_GROUP_TABLE_SIZE]", + "0x%x", 8) + gtable[gid,1] = tname "_1" + } + if (array_size(lptable2) != 0) { + print_table(lptable2, tname "_2[INAT_GROUP_TABLE_SIZE]", + "0x%x", 8) + gtable[gid,2] = tname "_2" + } + if (array_size(lptable3) != 0) { + print_table(lptable3, tname "_3[INAT_GROUP_TABLE_SIZE]", + "0x%x", 8) + gtable[gid,3] = tname "_3" + } + } else { + # print primary/escaped tables + if (array_size(table) != 0) { + print_table(table, tname "[INAT_OPCODE_TABLE_SIZE]", + "0x%02x", 256) + etable[eid,0] = tname + if (aid >= 0) + atable[aid,0] = tname + } + if (array_size(lptable1) != 0) { + print_table(lptable1,tname "_1[INAT_OPCODE_TABLE_SIZE]", + "0x%02x", 256) + etable[eid,1] = tname "_1" + if (aid >= 0) + atable[aid,1] = tname "_1" + } + if (array_size(lptable2) != 0) { + print_table(lptable2,tname "_2[INAT_OPCODE_TABLE_SIZE]", + "0x%02x", 256) + etable[eid,2] = tname "_2" + if (aid >= 0) + atable[aid,2] = tname "_2" + } + if (array_size(lptable3) != 0) { + print_table(lptable3,tname "_3[INAT_OPCODE_TABLE_SIZE]", + "0x%02x", 256) + etable[eid,3] = tname "_3" + if (aid >= 0) + atable[aid,3] = tname "_3" + } + } + print "" + clear_vars() +} + +function add_flags(old,new) { + if (old && new) + return old " | " new + else if (old) + return old + else + return new +} + +# convert operands to flags. +function convert_operands(count,opnd, i,j,imm,mod) +{ + imm = null + mod = null + for (j = 1; j <= count; j++) { + i = opnd[j] + if (match(i, imm_expr) == 1) { + if (!imm_flag[i]) + semantic_error("Unknown imm opnd: " i) + if (imm) { + if (i != "Ib") + semantic_error("Second IMM error") + imm = add_flags(imm, "INAT_SCNDIMM") + } else + imm = imm_flag[i] + } else if (match(i, modrm_expr)) + mod = "INAT_MODRM" + } + return add_flags(imm, mod) +} + +/^[0-9a-f]+\:/ { + if (NR == 1) + next + # get index + idx = "0x" substr($1, 1, index($1,":") - 1) + if (idx in table) + semantic_error("Redefine " idx " in " tname) + + # check if escaped opcode + if ("escape" == $2) { + if ($3 != "#") + semantic_error("No escaped name") + ref = "" + for (i = 4; i <= NF; i++) + ref = ref $i + if (ref in escape) + semantic_error("Redefine escape (" ref ")") + escape[ref] = geid + geid++ + table[idx] = "INAT_MAKE_ESCAPE(" escape[ref] ")" + next + } + + variant = null + # converts + i = 2 + while (i <= NF) { + opcode = $(i++) + delete opnds + ext = null + flags = null + opnd = null + # parse one opcode + if (match($i, opnd_expr)) { + opnd = $i + count = split($(i++), opnds, ",") + flags = convert_operands(count, opnds) + } + if (match($i, ext_expr)) + ext = $(i++) + if (match($i, sep_expr)) + i++ + else if (i < NF) + semantic_error($i " is not a separator") + + # check if group opcode + if (match(opcode, group_expr)) { + if (!(opcode in group)) { + group[opcode] = ggid + ggid++ + } + flags = add_flags(flags, "INAT_MAKE_GROUP(" group[opcode] ")") + } + # check force(or default) 64bit + if (match(ext, force64_expr)) + flags = add_flags(flags, "INAT_FORCE64") + + # check REX prefix + if (match(opcode, rex_expr)) + flags = add_flags(flags, "INAT_MAKE_PREFIX(INAT_PFX_REX)") + + # check coprocessor escape : TODO + if (match(opcode, fpu_expr)) + flags = add_flags(flags, "INAT_MODRM") + + # check VEX codes + if (match(ext, evexonly_expr)) + flags = add_flags(flags, "INAT_VEXOK | INAT_EVEXONLY") + else if (match(ext, vexonly_expr)) + flags = add_flags(flags, "INAT_VEXOK | INAT_VEXONLY") + else if (match(ext, vexok_expr) || match(opcode, vexok_opcode_expr)) + flags = add_flags(flags, "INAT_VEXOK") + + # check prefixes + if (match(ext, prefix_expr)) { + if (!prefix_num[opcode]) + semantic_error("Unknown prefix: " opcode) + flags = add_flags(flags, "INAT_MAKE_PREFIX(" prefix_num[opcode] ")") + } + if (length(flags) == 0) + continue + # check if last prefix + if (match(ext, lprefix1_expr)) { + lptable1[idx] = add_flags(lptable1[idx],flags) + variant = "INAT_VARIANT" + } + if (match(ext, lprefix2_expr)) { + lptable2[idx] = add_flags(lptable2[idx],flags) + variant = "INAT_VARIANT" + } + if (match(ext, lprefix3_expr)) { + lptable3[idx] = add_flags(lptable3[idx],flags) + variant = "INAT_VARIANT" + } + if (!match(ext, lprefix_expr)){ + table[idx] = add_flags(table[idx],flags) + } + } + if (variant) + table[idx] = add_flags(table[idx],variant) +} + +END { + if (awkchecked != "") + exit 1 + # print escape opcode map's array + print "/* Escape opcode map array */" + print "const insn_attr_t * const inat_escape_tables[INAT_ESC_MAX + 1]" \ + "[INAT_LSTPFX_MAX + 1] = {" + for (i = 0; i < geid; i++) + for (j = 0; j < max_lprefix; j++) + if (etable[i,j]) + print " ["i"]["j"] = "etable[i,j]"," + print "};\n" + # print group opcode map's array + print "/* Group opcode map array */" + print "const insn_attr_t * const inat_group_tables[INAT_GRP_MAX + 1]"\ + "[INAT_LSTPFX_MAX + 1] = {" + for (i = 0; i < ggid; i++) + for (j = 0; j < max_lprefix; j++) + if (gtable[i,j]) + print " ["i"]["j"] = "gtable[i,j]"," + print "};\n" + # print AVX opcode map's array + print "/* AVX opcode map array */" + print "const insn_attr_t * const inat_avx_tables[X86_VEX_M_MAX + 1]"\ + "[INAT_LSTPFX_MAX + 1] = {" + for (i = 0; i < gaid; i++) + for (j = 0; j < max_lprefix; j++) + if (atable[i,j]) + print " ["i"]["j"] = "atable[i,j]"," + print "};" +} + diff --git a/tools/objtool/orc.h b/tools/objtool/orc.h index a4139e386ef3..b0e92a6d0903 100644 --- a/tools/objtool/orc.h +++ b/tools/objtool/orc.h @@ -18,7 +18,7 @@ #ifndef _ORC_H #define _ORC_H -#include "orc_types.h" +#include <asm/orc_types.h> struct objtool_file; diff --git a/tools/objtool/orc_dump.c b/tools/objtool/orc_dump.c index 36c5bf6a2675..c3343820916a 100644 --- a/tools/objtool/orc_dump.c +++ b/tools/objtool/orc_dump.c @@ -76,7 +76,8 @@ int orc_dump(const char *_objname) int fd, nr_entries, i, *orc_ip = NULL, orc_size = 0; struct orc_entry *orc = NULL; char *name; - unsigned long nr_sections, orc_ip_addr = 0; + size_t nr_sections; + Elf64_Addr orc_ip_addr = 0; size_t shstrtab_idx; Elf *elf; Elf_Scn *scn; @@ -187,10 +188,10 @@ int orc_dump(const char *_objname) return -1; } - printf("%s+%lx:", name, rela.r_addend); + printf("%s+%llx:", name, (unsigned long long)rela.r_addend); } else { - printf("%lx:", orc_ip_addr + (i * sizeof(int)) + orc_ip[i]); + printf("%llx:", (unsigned long long)(orc_ip_addr + (i * sizeof(int)) + orc_ip[i])); } diff --git a/tools/objtool/orc_types.h b/tools/objtool/orc_types.h deleted file mode 100644 index 9c9dc579bd7d..000000000000 --- a/tools/objtool/orc_types.h +++ /dev/null @@ -1,107 +0,0 @@ -/* - * Copyright (C) 2017 Josh Poimboeuf <jpoimboe(a)redhat.com> - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License - * as published by the Free Software Foundation; either version 2 - * of the License, or (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef _ORC_TYPES_H -#define _ORC_TYPES_H - -#include <linux/types.h> -#include <linux/compiler.h> - -/* - * The ORC_REG_* registers are base registers which are used to find other - * registers on the stack. - * - * ORC_REG_PREV_SP, also known as DWARF Call Frame Address (CFA), is the - * address of the previous frame: the caller's SP before it called the current - * function. - * - * ORC_REG_UNDEFINED means the corresponding register's value didn't change in - * the current frame. - * - * The most commonly used base registers are SP and BP -- which the previous SP - * is usually based on -- and PREV_SP and UNDEFINED -- which the previous BP is - * usually based on. - * - * The rest of the base registers are needed for special cases like entry code - * and GCC realigned stacks. - */ -#define ORC_REG_UNDEFINED 0 -#define ORC_REG_PREV_SP 1 -#define ORC_REG_DX 2 -#define ORC_REG_DI 3 -#define ORC_REG_BP 4 -#define ORC_REG_SP 5 -#define ORC_REG_R10 6 -#define ORC_REG_R13 7 -#define ORC_REG_BP_INDIRECT 8 -#define ORC_REG_SP_INDIRECT 9 -#define ORC_REG_MAX 15 - -/* - * ORC_TYPE_CALL: Indicates that sp_reg+sp_offset resolves to PREV_SP (the - * caller's SP right before it made the call). Used for all callable - * functions, i.e. all C code and all callable asm functions. - * - * ORC_TYPE_REGS: Used in entry code to indicate that sp_reg+sp_offset points - * to a fully populated pt_regs from a syscall, interrupt, or exception. - * - * ORC_TYPE_REGS_IRET: Used in entry code to indicate that sp_reg+sp_offset - * points to the iret return frame. - * - * The UNWIND_HINT macros are used only for the unwind_hint struct. They - * aren't used in struct orc_entry due to size and complexity constraints. - * Objtool converts them to real types when it converts the hints to orc - * entries. - */ -#define ORC_TYPE_CALL 0 -#define ORC_TYPE_REGS 1 -#define ORC_TYPE_REGS_IRET 2 -#define UNWIND_HINT_TYPE_SAVE 3 -#define UNWIND_HINT_TYPE_RESTORE 4 - -#ifndef __ASSEMBLY__ -/* - * This struct is more or less a vastly simplified version of the DWARF Call - * Frame Information standard. It contains only the necessary parts of DWARF - * CFI, simplified for ease of access by the in-kernel unwinder. It tells the - * unwinder how to find the previous SP and BP (and sometimes entry regs) on - * the stack for a given code address. Each instance of the struct corresponds - * to one or more code locations. - */ -struct orc_entry { - s16 sp_offset; - s16 bp_offset; - unsigned sp_reg:4; - unsigned bp_reg:4; - unsigned type:2; -} __packed; - -/* - * This struct is used by asm and inline asm code to manually annotate the - * location of registers on the stack for the ORC unwinder. - * - * Type can be either ORC_TYPE_* or UNWIND_HINT_TYPE_*. - */ -struct unwind_hint { - u32 ip; - s16 sp_offset; - u8 sp_reg; - u8 type; -}; -#endif /* __ASSEMBLY__ */ - -#endif /* _ORC_TYPES_H */ diff --git a/tools/objtool/sync-check.sh b/tools/objtool/sync-check.sh new file mode 100755 index 000000000000..1470e74e9d66 --- /dev/null +++ b/tools/objtool/sync-check.sh @@ -0,0 +1,29 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 + +FILES=' +arch/x86/lib/insn.c +arch/x86/lib/inat.c +arch/x86/lib/x86-opcode-map.txt +arch/x86/tools/gen-insn-attr-x86.awk +arch/x86/include/asm/insn.h +arch/x86/include/asm/inat.h +arch/x86/include/asm/inat_types.h +arch/x86/include/asm/orc_types.h +' + +check() +{ + local file=$1 + + diff $file ../../$file > /dev/null || + echo "Warning: synced file at 'tools/objtool/$file' differs from latest kernel version at '$file'" +} + +if [ ! -d ../../kernel ] || [ ! -d ../../tools ] || [ ! -d ../objtool ]; then + exit 0 +fi + +for i in $FILES; do + check $i +done diff --git a/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt b/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt index 12e377184ee4..e0b85930dd77 100644 --- a/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt +++ b/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt @@ -607,7 +607,7 @@ fb: psubq Pq,Qq | vpsubq Vx,Hx,Wx (66),(v1) fc: paddb Pq,Qq | vpaddb Vx,Hx,Wx (66),(v1) fd: paddw Pq,Qq | vpaddw Vx,Hx,Wx (66),(v1) fe: paddd Pq,Qq | vpaddd Vx,Hx,Wx (66),(v1) -ff: +ff: UD0 EndTable Table: 3-byte opcode 1 (0x0f 0x38) @@ -717,7 +717,7 @@ AVXcode: 2 7e: vpermt2d/q Vx,Hx,Wx (66),(ev) 7f: vpermt2ps/d Vx,Hx,Wx (66),(ev) 80: INVEPT Gy,Mdq (66) -81: INVPID Gy,Mdq (66) +81: INVVPID Gy,Mdq (66) 82: INVPCID Gy,Mdq (66) 83: vpmultishiftqb Vx,Hx,Wx (66),(ev) 88: vexpandps/d Vpd,Wpd (66),(ev) @@ -896,7 +896,7 @@ EndTable GrpTable: Grp3_1 0: TEST Eb,Ib -1: +1: TEST Eb,Ib 2: NOT Eb 3: NEG Eb 4: MUL AL,Eb @@ -970,6 +970,15 @@ GrpTable: Grp9 EndTable GrpTable: Grp10 +# all are UD1 +0: UD1 +1: UD1 +2: UD1 +3: UD1 +4: UD1 +5: UD1 +6: UD1 +7: UD1 EndTable # Grp11A and Grp11B are expressed as Grp11 in Intel SDM diff --git a/tools/testing/selftests/x86/ldt_gdt.c b/tools/testing/selftests/x86/ldt_gdt.c index 66e5ce5b91f0..0304ffb714f2 100644 --- a/tools/testing/selftests/x86/ldt_gdt.c +++ b/tools/testing/selftests/x86/ldt_gdt.c @@ -627,13 +627,10 @@ static void do_multicpu_tests(void) static int finish_exec_test(void) { /* - * In a sensible world, this would be check_invalid_segment(0, 1); - * For better or for worse, though, the LDT is inherited across exec. - * We can probably change this safely, but for now we test it. + * Older kernel versions did inherit the LDT on exec() which is + * wrong because exec() starts from a clean state. */ - check_valid_segment(0, 1, - AR_DPL3 | AR_TYPE_XRCODE | AR_S | AR_P | AR_DB, - 42, true); + check_invalid_segment(0, 1); return nerrs ? 1 : 0; } diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c index b36945d49986..b4b69c2d1012 100644 --- a/virt/kvm/arm/mmu.c +++ b/virt/kvm/arm/mmu.c @@ -509,8 +509,6 @@ static void unmap_hyp_range(pgd_t *pgdp, phys_addr_t start, u64 size) */ void free_hyp_pgds(void) { - unsigned long addr; - mutex_lock(&kvm_hyp_pgd_mutex); if (boot_hyp_pgd) { @@ -521,10 +519,10 @@ void free_hyp_pgds(void) if (hyp_pgd) { unmap_hyp_range(hyp_pgd, hyp_idmap_start, PAGE_SIZE); - for (addr = PAGE_OFFSET; virt_addr_valid(addr); addr += PGDIR_SIZE) - unmap_hyp_range(hyp_pgd, kern_hyp_va(addr), PGDIR_SIZE); - for (addr = VMALLOC_START; is_vmalloc_addr((void*)addr); addr += PGDIR_SIZE) - unmap_hyp_range(hyp_pgd, kern_hyp_va(addr), PGDIR_SIZE); + unmap_hyp_range(hyp_pgd, kern_hyp_va(PAGE_OFFSET), + (uintptr_t)high_memory - PAGE_OFFSET); + unmap_hyp_range(hyp_pgd, kern_hyp_va(VMALLOC_START), + VMALLOC_END - VMALLOC_START); free_pages((unsigned long)hyp_pgd, hyp_pgd_order); hyp_pgd = NULL;

7 years, 8 months

1
0
0 0

Re: [Linux-stable-mirror] Linux 4.9.73

by Greg KH

diff --git a/Makefile b/Makefile index 78dde51d9d74..64eb0bf614ee 100644 --- a/Makefile +++ b/Makefile @@ -1,6 +1,6 @@ VERSION = 4 PATCHLEVEL = 9 -SUBLEVEL = 72 +SUBLEVEL = 73 EXTRAVERSION = NAME = Roaring Lionus diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c index 72c27b8d2cf3..083f92746951 100644 --- a/arch/powerpc/perf/core-book3s.c +++ b/arch/powerpc/perf/core-book3s.c @@ -401,8 +401,12 @@ static __u64 power_pmu_bhrb_to(u64 addr) int ret; __u64 target; - if (is_kernel_addr(addr)) - return branch_target((unsigned int *)addr); + if (is_kernel_addr(addr)) { + if (probe_kernel_read(&instr, (void *)addr, sizeof(instr))) + return 0; + + return branch_target(&instr); + } /* Userspace: need copy instruction here then translate it */ pagefault_disable(); diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index 72b737b8c9d6..c8f8dd8ca0a1 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -2395,9 +2395,21 @@ static int rsm_load_seg_64(struct x86_emulate_ctxt *ctxt, u64 smbase, int n) } static int rsm_enter_protected_mode(struct x86_emulate_ctxt *ctxt, - u64 cr0, u64 cr4) + u64 cr0, u64 cr3, u64 cr4) { int bad; + u64 pcid; + + /* In order to later set CR4.PCIDE, CR3[11:0] must be zero. */ + pcid = 0; + if (cr4 & X86_CR4_PCIDE) { + pcid = cr3 & 0xfff; + cr3 &= ~0xfff; + } + + bad = ctxt->ops->set_cr(ctxt, 3, cr3); + if (bad) + return X86EMUL_UNHANDLEABLE; /* * First enable PAE, long mode needs it before CR0.PG = 1 is set. @@ -2416,6 +2428,12 @@ static int rsm_enter_protected_mode(struct x86_emulate_ctxt *ctxt, bad = ctxt->ops->set_cr(ctxt, 4, cr4); if (bad) return X86EMUL_UNHANDLEABLE; + if (pcid) { + bad = ctxt->ops->set_cr(ctxt, 3, cr3 | pcid); + if (bad) + return X86EMUL_UNHANDLEABLE; + } + } return X86EMUL_CONTINUE; @@ -2426,11 +2444,11 @@ static int rsm_load_state_32(struct x86_emulate_ctxt *ctxt, u64 smbase) struct desc_struct desc; struct desc_ptr dt; u16 selector; - u32 val, cr0, cr4; + u32 val, cr0, cr3, cr4; int i; cr0 = GET_SMSTATE(u32, smbase, 0x7ffc); - ctxt->ops->set_cr(ctxt, 3, GET_SMSTATE(u32, smbase, 0x7ff8)); + cr3 = GET_SMSTATE(u32, smbase, 0x7ff8); ctxt->eflags = GET_SMSTATE(u32, smbase, 0x7ff4) | X86_EFLAGS_FIXED; ctxt->_eip = GET_SMSTATE(u32, smbase, 0x7ff0); @@ -2472,14 +2490,14 @@ static int rsm_load_state_32(struct x86_emulate_ctxt *ctxt, u64 smbase) ctxt->ops->set_smbase(ctxt, GET_SMSTATE(u32, smbase, 0x7ef8)); - return rsm_enter_protected_mode(ctxt, cr0, cr4); + return rsm_enter_protected_mode(ctxt, cr0, cr3, cr4); } static int rsm_load_state_64(struct x86_emulate_ctxt *ctxt, u64 smbase) { struct desc_struct desc; struct desc_ptr dt; - u64 val, cr0, cr4; + u64 val, cr0, cr3, cr4; u32 base3; u16 selector; int i, r; @@ -2496,7 +2514,7 @@ static int rsm_load_state_64(struct x86_emulate_ctxt *ctxt, u64 smbase) ctxt->ops->set_dr(ctxt, 7, (val & DR7_VOLATILE) | DR7_FIXED_1); cr0 = GET_SMSTATE(u64, smbase, 0x7f58); - ctxt->ops->set_cr(ctxt, 3, GET_SMSTATE(u64, smbase, 0x7f50)); + cr3 = GET_SMSTATE(u64, smbase, 0x7f50); cr4 = GET_SMSTATE(u64, smbase, 0x7f48); ctxt->ops->set_smbase(ctxt, GET_SMSTATE(u32, smbase, 0x7f00)); val = GET_SMSTATE(u64, smbase, 0x7ed0); @@ -2524,7 +2542,7 @@ static int rsm_load_state_64(struct x86_emulate_ctxt *ctxt, u64 smbase) dt.address = GET_SMSTATE(u64, smbase, 0x7e68); ctxt->ops->set_gdt(ctxt, &dt); - r = rsm_enter_protected_mode(ctxt, cr0, cr4); + r = rsm_enter_protected_mode(ctxt, cr0, cr3, cr4); if (r != X86EMUL_CONTINUE) return r; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index f4d893713d54..7e28e6c877d9 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7132,7 +7132,7 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) #endif kvm_rip_write(vcpu, regs->rip); - kvm_set_rflags(vcpu, regs->rflags); + kvm_set_rflags(vcpu, regs->rflags | X86_EFLAGS_FIXED); vcpu->arch.exception.pending = false; diff --git a/crypto/mcryptd.c b/crypto/mcryptd.c index c207458d6299..a14100e74754 100644 --- a/crypto/mcryptd.c +++ b/crypto/mcryptd.c @@ -80,6 +80,7 @@ static int mcryptd_init_queue(struct mcryptd_queue *queue, pr_debug("cpu_queue #%d %p\n", cpu, queue->cpu_queue); crypto_init_queue(&cpu_queue->queue, max_cpu_qlen); INIT_WORK(&cpu_queue->work, mcryptd_queue_worker); + spin_lock_init(&cpu_queue->q_lock); } return 0; } @@ -103,15 +104,16 @@ static int mcryptd_enqueue_request(struct mcryptd_queue *queue, int cpu, err; struct mcryptd_cpu_queue *cpu_queue; - cpu = get_cpu(); - cpu_queue = this_cpu_ptr(queue->cpu_queue); - rctx->tag.cpu = cpu; + cpu_queue = raw_cpu_ptr(queue->cpu_queue); + spin_lock(&cpu_queue->q_lock); + cpu = smp_processor_id(); + rctx->tag.cpu = smp_processor_id(); err = crypto_enqueue_request(&cpu_queue->queue, request); pr_debug("enqueue request: cpu %d cpu_queue %p request %p\n", cpu, cpu_queue, request); + spin_unlock(&cpu_queue->q_lock); queue_work_on(cpu, kcrypto_wq, &cpu_queue->work); - put_cpu(); return err; } @@ -160,16 +162,11 @@ static void mcryptd_queue_worker(struct work_struct *work) cpu_queue = container_of(work, struct mcryptd_cpu_queue, work); i = 0; while (i < MCRYPTD_BATCH || single_task_running()) { - /* - * preempt_disable/enable is used to prevent - * being preempted by mcryptd_enqueue_request() - */ - local_bh_disable(); - preempt_disable(); + + spin_lock_bh(&cpu_queue->q_lock); backlog = crypto_get_backlog(&cpu_queue->queue); req = crypto_dequeue_request(&cpu_queue->queue); - preempt_enable(); - local_bh_enable(); + spin_unlock_bh(&cpu_queue->q_lock); if (!req) { mcryptd_opportunistic_flush(); @@ -184,7 +181,7 @@ static void mcryptd_queue_worker(struct work_struct *work) ++i; } if (cpu_queue->queue.qlen) - queue_work(kcrypto_wq, &cpu_queue->work); + queue_work_on(smp_processor_id(), kcrypto_wq, &cpu_queue->work); } void mcryptd_flusher(struct work_struct *__work) diff --git a/drivers/acpi/apei/erst.c b/drivers/acpi/apei/erst.c index ec4f507b524f..4558cc73abf3 100644 --- a/drivers/acpi/apei/erst.c +++ b/drivers/acpi/apei/erst.c @@ -1020,7 +1020,7 @@ static ssize_t erst_reader(u64 *id, enum pstore_type_id *type, int *count, /* The record may be cleared by others, try read next record */ if (len == -ENOENT) goto skip; - else if (len < sizeof(*rcd)) { + else if (len < 0 || len < sizeof(*rcd)) { rc = -EIO; goto out; } diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c index f3bc901ac930..fe03d00de22b 100644 --- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -1390,6 +1390,11 @@ static int acpi_nfit_add_dimm(struct acpi_nfit_desc *acpi_desc, dev_name(&adev_dimm->dev)); return -ENXIO; } + /* + * Record nfit_mem for the notification path to track back to + * the nfit sysfs attributes for this dimm device object. + */ + dev_set_drvdata(&adev_dimm->dev, nfit_mem); /* * Until standardization materializes we need to consider 4 @@ -1446,9 +1451,11 @@ static void shutdown_dimm_notify(void *data) sysfs_put(nfit_mem->flags_attr); nfit_mem->flags_attr = NULL; } - if (adev_dimm) + if (adev_dimm) { acpi_remove_notify_handler(adev_dimm->handle, ACPI_DEVICE_NOTIFY, acpi_nvdimm_notify); + dev_set_drvdata(&adev_dimm->dev, NULL); + } } mutex_unlock(&acpi_desc->init_mutex); } diff --git a/drivers/clk/sunxi/clk-sun9i-mmc.c b/drivers/clk/sunxi/clk-sun9i-mmc.c index 6041bdba2e97..f69f9e8c6f38 100644 --- a/drivers/clk/sunxi/clk-sun9i-mmc.c +++ b/drivers/clk/sunxi/clk-sun9i-mmc.c @@ -16,6 +16,7 @@ #include <linux/clk.h> #include <linux/clk-provider.h> +#include <linux/delay.h> #include <linux/init.h> #include <linux/of.h> #include <linux/of_device.h> @@ -83,9 +84,20 @@ static int sun9i_mmc_reset_deassert(struct reset_controller_dev *rcdev, return 0; } +static int sun9i_mmc_reset_reset(struct reset_controller_dev *rcdev, + unsigned long id) +{ + sun9i_mmc_reset_assert(rcdev, id); + udelay(10); + sun9i_mmc_reset_deassert(rcdev, id); + + return 0; +} + static const struct reset_control_ops sun9i_mmc_reset_ops = { .assert = sun9i_mmc_reset_assert, .deassert = sun9i_mmc_reset_deassert, + .reset = sun9i_mmc_reset_reset, }; static int sun9i_a80_mmc_config_clk_probe(struct platform_device *pdev) diff --git a/drivers/mfd/cros_ec_spi.c b/drivers/mfd/cros_ec_spi.c index a518832ed5f5..59dbdaa24c28 100644 --- a/drivers/mfd/cros_ec_spi.c +++ b/drivers/mfd/cros_ec_spi.c @@ -664,6 +664,7 @@ static int cros_ec_spi_probe(struct spi_device *spi) sizeof(struct ec_response_get_protocol_info); ec_dev->dout_size = sizeof(struct ec_host_request); + ec_spi->last_transfer_ns = ktime_get_ns(); err = cros_ec_register(ec_dev); if (err) { diff --git a/drivers/mfd/twl4030-audio.c b/drivers/mfd/twl4030-audio.c index 0a1606480023..cc832d309599 100644 --- a/drivers/mfd/twl4030-audio.c +++ b/drivers/mfd/twl4030-audio.c @@ -159,13 +159,18 @@ unsigned int twl4030_audio_get_mclk(void) EXPORT_SYMBOL_GPL(twl4030_audio_get_mclk); static bool twl4030_audio_has_codec(struct twl4030_audio_data *pdata, - struct device_node *node) + struct device_node *parent) { + struct device_node *node; + if (pdata && pdata->codec) return true; - if (of_find_node_by_name(node, "codec")) + node = of_get_child_by_name(parent, "codec"); + if (node) { + of_node_put(node); return true; + } return false; } diff --git a/drivers/mfd/twl6040.c b/drivers/mfd/twl6040.c index d66502d36ba0..dd19f17a1b63 100644 --- a/drivers/mfd/twl6040.c +++ b/drivers/mfd/twl6040.c @@ -97,12 +97,16 @@ static struct reg_sequence twl6040_patch[] = { }; -static bool twl6040_has_vibra(struct device_node *node) +static bool twl6040_has_vibra(struct device_node *parent) { -#ifdef CONFIG_OF - if (of_find_node_by_name(node, "vibra")) + struct device_node *node; + + node = of_get_child_by_name(parent, "vibra"); + if (node) { + of_node_put(node); return true; -#endif + } + return false; } diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c index 6ea10a9f33e8..fa463268d019 100644 --- a/drivers/net/ethernet/marvell/mvneta.c +++ b/drivers/net/ethernet/marvell/mvneta.c @@ -1182,6 +1182,10 @@ static void mvneta_port_disable(struct mvneta_port *pp) val &= ~MVNETA_GMAC0_PORT_ENABLE; mvreg_write(pp, MVNETA_GMAC_CTRL_0, val); + pp->link = 0; + pp->duplex = -1; + pp->speed = 0; + udelay(200); } @@ -1905,9 +1909,9 @@ static int mvneta_rx_swbm(struct mvneta_port *pp, int rx_todo, if (!mvneta_rxq_desc_is_first_last(rx_status) || (rx_status & MVNETA_RXD_ERR_SUMMARY)) { + mvneta_rx_error(pp, rx_desc); err_drop_frame: dev->stats.rx_errors++; - mvneta_rx_error(pp, rx_desc); /* leave the descriptor untouched */ continue; } @@ -2922,7 +2926,7 @@ static void mvneta_cleanup_rxqs(struct mvneta_port *pp) { int queue; - for (queue = 0; queue < txq_number; queue++) + for (queue = 0; queue < rxq_number; queue++) mvneta_rxq_deinit(pp, &pp->rxqs[queue]); } diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c index 71eb6c637b60..42abdd2391c9 100644 --- a/drivers/nvdimm/pfn_devs.c +++ b/drivers/nvdimm/pfn_devs.c @@ -352,9 +352,9 @@ struct device *nd_pfn_create(struct nd_region *nd_region) int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig) { u64 checksum, offset; - unsigned long align; enum nd_pfn_mode mode; struct nd_namespace_io *nsio; + unsigned long align, start_pad; struct nd_pfn_sb *pfn_sb = nd_pfn->pfn_sb; struct nd_namespace_common *ndns = nd_pfn->ndns; const u8 *parent_uuid = nd_dev_to_uuid(&ndns->dev); @@ -398,6 +398,7 @@ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig) align = le32_to_cpu(pfn_sb->align); offset = le64_to_cpu(pfn_sb->dataoff); + start_pad = le32_to_cpu(pfn_sb->start_pad); if (align == 0) align = 1UL << ilog2(offset); mode = le32_to_cpu(pfn_sb->mode); @@ -456,7 +457,7 @@ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig) return -EBUSY; } - if ((align && !IS_ALIGNED(offset, align)) + if ((align && !IS_ALIGNED(nsio->res.start + offset + start_pad, align)) || !IS_ALIGNED(offset, PAGE_SIZE)) { dev_err(&nd_pfn->dev, "bad offset: %#llx dax disabled align: %#lx\n", diff --git a/drivers/parisc/lba_pci.c b/drivers/parisc/lba_pci.c index bc286cbbbc9b..1cced1d039d7 100644 --- a/drivers/parisc/lba_pci.c +++ b/drivers/parisc/lba_pci.c @@ -1656,3 +1656,36 @@ void lba_set_iregs(struct parisc_device *lba, u32 ibase, u32 imask) iounmap(base_addr); } + +/* + * The design of the Diva management card in rp34x0 machines (rp3410, rp3440) + * seems rushed, so that many built-in components simply don't work. + * The following quirks disable the serial AUX port and the built-in ATI RV100 + * Radeon 7000 graphics card which both don't have any external connectors and + * thus are useless, and even worse, e.g. the AUX port occupies ttyS0 and as + * such makes those machines the only PARISC machines on which we can't use + * ttyS0 as boot console. + */ +static void quirk_diva_ati_card(struct pci_dev *dev) +{ + if (dev->subsystem_vendor != PCI_VENDOR_ID_HP || + dev->subsystem_device != 0x1292) + return; + + dev_info(&dev->dev, "Hiding Diva built-in ATI card"); + dev->device = 0; +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_RADEON_QY, + quirk_diva_ati_card); + +static void quirk_diva_aux_disable(struct pci_dev *dev) +{ + if (dev->subsystem_vendor != PCI_VENDOR_ID_HP || + dev->subsystem_device != 0x1291) + return; + + dev_info(&dev->dev, "Hiding Diva built-in AUX serial device"); + dev->device = 0; +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_DIVA_AUX, + quirk_diva_aux_disable); diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c index 8a68e2b554e1..802997e2ddcc 100644 --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -953,7 +953,12 @@ static int pci_pm_thaw_noirq(struct device *dev) if (pci_has_legacy_pm_support(pci_dev)) return pci_legacy_resume_early(dev); - pci_update_current_state(pci_dev, PCI_D0); + /* + * pci_restore_state() requires the device to be in D0 (because of MSI + * restoration among other things), so force it into D0 in case the + * driver's "freeze" callbacks put it into a low-power state directly. + */ + pci_set_power_state(pci_dev, PCI_D0); pci_restore_state(pci_dev); if (drv && drv->pm && drv->pm->thaw_noirq) diff --git a/drivers/pinctrl/intel/pinctrl-cherryview.c b/drivers/pinctrl/intel/pinctrl-cherryview.c index 0d34d8a4ab14..e8c08eb97530 100644 --- a/drivers/pinctrl/intel/pinctrl-cherryview.c +++ b/drivers/pinctrl/intel/pinctrl-cherryview.c @@ -1594,6 +1594,22 @@ static int chv_gpio_probe(struct chv_pinctrl *pctrl, int irq) clear_bit(i, chip->irq_valid_mask); } + /* + * The same set of machines in chv_no_valid_mask[] have incorrectly + * configured GPIOs that generate spurious interrupts so we use + * this same list to apply another quirk for them. + * + * See also https://bugzilla.kernel.org/show_bug.cgi?id=197953. + */ + if (!need_valid_mask) { + /* + * Mask all interrupts the community is able to generate + * but leave the ones that can only generate GPEs unmasked. + */ + chv_writel(GENMASK(31, pctrl->community->nirqs), + pctrl->regs + CHV_INTMASK); + } + /* Clear all interrupts */ chv_writel(0xffff, pctrl->regs + CHV_INTSTAT); diff --git a/drivers/spi/spi-xilinx.c b/drivers/spi/spi-xilinx.c index bc7100b93dfc..e0b9fe1d0e37 100644 --- a/drivers/spi/spi-xilinx.c +++ b/drivers/spi/spi-xilinx.c @@ -271,6 +271,7 @@ static int xilinx_spi_txrx_bufs(struct spi_device *spi, struct spi_transfer *t) while (remaining_words) { int n_words, tx_words, rx_words; u32 sr; + int stalled; n_words = min(remaining_words, xspi->buffer_size); @@ -299,7 +300,17 @@ static int xilinx_spi_txrx_bufs(struct spi_device *spi, struct spi_transfer *t) /* Read out all the data from the Rx FIFO */ rx_words = n_words; + stalled = 10; while (rx_words) { + if (rx_words == n_words && !(stalled--) && + !(sr & XSPI_SR_TX_EMPTY_MASK) && + (sr & XSPI_SR_RX_EMPTY_MASK)) { + dev_err(&spi->dev, + "Detected stall. Check C_SPI_MODE and C_SPI_MEMORY\n"); + xspi_init_hw(xspi); + return -EIO; + } + if ((sr & XSPI_SR_TX_EMPTY_MASK) && (rx_words > 1)) { xilinx_spi_rx(xspi); rx_words--; diff --git a/include/crypto/mcryptd.h b/include/crypto/mcryptd.h index 4a53c0d38cd2..e045722a69aa 100644 --- a/include/crypto/mcryptd.h +++ b/include/crypto/mcryptd.h @@ -26,6 +26,7 @@ static inline struct mcryptd_ahash *__mcryptd_ahash_cast( struct mcryptd_cpu_queue { struct crypto_queue queue; + spinlock_t q_lock; struct work_struct work; }; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 8b1ebe4c6aba..d7eeebfafe8d 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -2722,11 +2722,12 @@ static bool states_equal(struct bpf_verifier_env *env, /* If we didn't map access then again we don't care about the * mismatched range values and it's ok if our old type was - * UNKNOWN and we didn't go to a NOT_INIT'ed reg. + * UNKNOWN and we didn't go to a NOT_INIT'ed or pointer reg. */ if (rold->type == NOT_INIT || (!varlen_map_access && rold->type == UNKNOWN_VALUE && - rcur->type != NOT_INIT)) + rcur->type != NOT_INIT && + !__is_pointer_value(env->allow_ptr_leaks, rcur))) continue; /* Don't care about the reg->id in this case. */ diff --git a/sound/core/rawmidi.c b/sound/core/rawmidi.c index b450a27588c8..16f8124b1150 100644 --- a/sound/core/rawmidi.c +++ b/sound/core/rawmidi.c @@ -579,15 +579,14 @@ static int snd_rawmidi_info_user(struct snd_rawmidi_substream *substream, return 0; } -int snd_rawmidi_info_select(struct snd_card *card, struct snd_rawmidi_info *info) +static int __snd_rawmidi_info_select(struct snd_card *card, + struct snd_rawmidi_info *info) { struct snd_rawmidi *rmidi; struct snd_rawmidi_str *pstr; struct snd_rawmidi_substream *substream; - mutex_lock(&register_mutex); rmidi = snd_rawmidi_search(card, info->device); - mutex_unlock(&register_mutex); if (!rmidi) return -ENXIO; if (info->stream < 0 || info->stream > 1) @@ -603,6 +602,16 @@ int snd_rawmidi_info_select(struct snd_card *card, struct snd_rawmidi_info *info } return -ENXIO; } + +int snd_rawmidi_info_select(struct snd_card *card, struct snd_rawmidi_info *info) +{ + int ret; + + mutex_lock(&register_mutex); + ret = __snd_rawmidi_info_select(card, info); + mutex_unlock(&register_mutex); + return ret; +} EXPORT_SYMBOL(snd_rawmidi_info_select); static int snd_rawmidi_info_select_user(struct snd_card *card, diff --git a/sound/usb/mixer.c b/sound/usb/mixer.c index 24c897f0b571..08015c139116 100644 --- a/sound/usb/mixer.c +++ b/sound/usb/mixer.c @@ -2167,20 +2167,25 @@ static int parse_audio_selector_unit(struct mixer_build *state, int unitid, kctl->private_value = (unsigned long)namelist; kctl->private_free = usb_mixer_selector_elem_free; - nameid = uac_selector_unit_iSelector(desc); + /* check the static mapping table at first */ len = check_mapped_name(map, kctl->id.name, sizeof(kctl->id.name)); - if (len) - ; - else if (nameid) - len = snd_usb_copy_string_desc(state, nameid, kctl->id.name, - sizeof(kctl->id.name)); - else - len = get_term_name(state, &state->oterm, - kctl->id.name, sizeof(kctl->id.name), 0); - if (!len) { - strlcpy(kctl->id.name, "USB", sizeof(kctl->id.name)); + /* no mapping ? */ + /* if iSelector is given, use it */ + nameid = uac_selector_unit_iSelector(desc); + if (nameid) + len = snd_usb_copy_string_desc(state, nameid, + kctl->id.name, + sizeof(kctl->id.name)); + /* ... or pick up the terminal name at next */ + if (!len) + len = get_term_name(state, &state->oterm, + kctl->id.name, sizeof(kctl->id.name), 0); + /* ... or use the fixed string "USB" as the last resort */ + if (!len) + strlcpy(kctl->id.name, "USB", sizeof(kctl->id.name)); + /* and add the proper suffix */ if (desc->bDescriptorSubtype == UAC2_CLOCK_SELECTOR) append_ctl_name(kctl, " Clock Source"); else if ((state->oterm.type & 0xff00) == 0x0100) diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index 7613b9e07b5c..1cd7f8b0bf77 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -1170,10 +1170,11 @@ static bool is_marantz_denon_dac(unsigned int id) /* TEAC UD-501/UD-503/NT-503 USB DACs need a vendor cmd to switch * between PCM/DOP and native DSD mode */ -static bool is_teac_50X_dac(unsigned int id) +static bool is_teac_dsd_dac(unsigned int id) { switch (id) { case USB_ID(0x0644, 0x8043): /* TEAC UD-501/UD-503/NT-503 */ + case USB_ID(0x0644, 0x8044): /* Esoteric D-05X */ return true; } return false; @@ -1206,7 +1207,7 @@ int snd_usb_select_mode_quirk(struct snd_usb_substream *subs, break; } mdelay(20); - } else if (is_teac_50X_dac(subs->stream->chip->usb_id)) { + } else if (is_teac_dsd_dac(subs->stream->chip->usb_id)) { /* Vendor mode switch cmd is required. */ switch (fmt->altsetting) { case 3: /* DSD mode (DSD_U32) requested */ @@ -1376,7 +1377,7 @@ u64 snd_usb_interface_dsd_format_quirks(struct snd_usb_audio *chip, } /* TEAC devices with USB DAC functionality */ - if (is_teac_50X_dac(chip->usb_id)) { + if (is_teac_dsd_dac(chip->usb_id)) { if (fp->altsetting == 3) return SNDRV_PCM_FMTBIT_DSD_U32_BE; }

7 years, 8 months

1
0
0 0

Re: [Linux-stable-mirror] [patch 2/4] nohz: Prevent erroneous tick stop invocations

by Frederic Weisbecker

On Fri, Dec 22, 2017 at 03:51:13PM +0100, Thomas Gleixner wrote: > The conditions in irq_exit() to invoke tick_nohz_irq_exit() are: > > if ((idle_cpu(cpu) && !need_resched()) || tick_nohz_full_cpu(cpu)) > > This is too permissive in various aspects: > > 1) If need_resched() is set, then the tick cannot be stopped whether > the CPU is idle or in nohz full mode. That's not exactly true. In nohz full mode the tick is not restarted on the switch from idle to a single task. And if an idle interrupt wakes up a single task and enqueues a timer, we want that timer to be programmed even though we have need_resched().

7 years, 8 months

2
4
0 0

[Linux-stable-mirror] Build failures in v4.1.y-stable-queue

by Guenter Roeck

Problems reported yesterday have been fixed. New problems: Building arm:axm55xx_defconfig ... failed -------------- Error log: /opt/buildbot/slave/stable-queue-4.1/build/arch/arm/kvm/interrupts.S: Assembler messages: /opt/buildbot/slave/stable-queue-4.1/build/arch/arm/kvm/interrupts.S:339: Error: garbage following instruction -- `ldr r2,=BSYM(panic)' /opt/buildbot/slave/stable-queue-4.1/build/arch/arm/kvm/interrupts.S:343: Error: garbage following instruction -- `ldr r2,=BSYM(panic)' /opt/buildbot/slave/stable-queue-4.1/build/arch/arm/kvm/interrupts.S:347: Error: garbage following instruction -- `ldr r2,=BSYM(panic)' /opt/buildbot/slave/stable-queue-4.1/build/arch/arm/kvm/interrupts.S:351: Error: garbage following instruction -- `ldr r2,=BSYM(panic)' Looks like something is wrong with the backport of fdc31f5c95c9 ("ARM: replace BSYM() with badr assembly macro") ? Building mips:defconfig ... failed -------------- Error log: /opt/buildbot/slave/stable-queue-4.1/build/arch/mips/kernel/genex.S: Assembler messages: /opt/buildbot/slave/stable-queue-4.1/build/arch/mips/kernel/genex.S:219: Error: absolute expression required `li $9,_IRQ_STACK_SIZE' /opt/buildbot/slave/stable-queue-4.1/build/arch/mips/kernel/genex.S:329: Error: absolute expression required `li $9,_IRQ_STACK_SIZE' _IRQ_STACK_SIZE is not defined. It was introduced with commit fe8bd18ffea53 ("MIPS: Introduce irq_stack"). No idea if that can be backported. Guenter

7 years, 8 months

2
2
0 0

[Linux-stable-mirror] [PATCH V2 1/6] blk-mq: quiesce queue before freeing queue

by Ming Lei

After queue is frozen, dispatch still may happen, for example: 1) requests are submitted from several contexts 2) requests from all these contexts are inserted to queue, but may dispatch to LLD in one of these paths, but other paths sill need to move on even all these requests are completed(that means blk_mq_freeze_queue_wait() returns at that time) 3) dispatch after queue freezing still moves on and causes use-after-free, because request queue is freed This patch quiesces queue after it is frozen, and makes sure all in-progress dispatch are completed. This patch fixes the following kernel crash when running heavy IOs vs. deleting device: [ 36.719251] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 36.720318] IP: kyber_has_work+0x14/0x40 [ 36.720847] PGD 254bf5067 P4D 254bf5067 PUD 255e6a067 PMD 0 [ 36.721584] Oops: 0000 [#1] PREEMPT SMP [ 36.722105] Dumping ftrace buffer: [ 36.722570] (ftrace buffer empty) [ 36.723057] Modules linked in: scsi_debug ebtable_filter ebtables ip6table_filter ip6_tables tcm_loop iscsi_target_mod target_core_file target_core_iblock target_core_pscsi target_core_mod xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c bridge stp llc fuse iptable_filter ip_tables sd_mod sg btrfs xor zstd_decompress zstd_compress xxhash raid6_pq mptsas mptscsih bcache crc32c_intel ahci mptbase libahci serio_raw scsi_transport_sas nvme libata shpchp lpc_ich virtio_scsi nvme_core binfmt_misc dm_mod iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi null_blk configs [ 36.733438] CPU: 2 PID: 2374 Comm: fio Not tainted 4.15.0-rc2.blk_mq_quiesce+ #714 [ 36.735143] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.9.3-1.fc25 04/01/2014 [ 36.736688] RIP: 0010:kyber_has_work+0x14/0x40 [ 36.737515] RSP: 0018:ffffc9000209bca0 EFLAGS: 00010202 [ 36.738431] RAX: 0000000000000008 RBX: ffff88025578bfc8 RCX: ffff880257bf4ed0 [ 36.739581] RDX: 0000000000000038 RSI: ffffffff81a98c6d RDI: ffff88025578bfc8 [ 36.740730] RBP: ffff880253cebfc8 R08: ffffc9000209bda0 R09: ffff8802554f3480 [ 36.741885] R10: ffffc9000209be60 R11: ffff880263f72538 R12: ffff88025573e9e8 [ 36.743036] R13: ffff88025578bfd0 R14: 0000000000000001 R15: 0000000000000000 [ 36.744189] FS: 00007f9b9bee67c0(0000) GS:ffff88027fc80000(0000) knlGS:0000000000000000 [ 36.746617] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 36.748483] CR2: 0000000000000008 CR3: 0000000254bf4001 CR4: 00000000003606e0 [ 36.750164] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 36.751455] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 36.752796] Call Trace: [ 36.753992] blk_mq_do_dispatch_sched+0x7f/0xe0 [ 36.755110] blk_mq_sched_dispatch_requests+0x119/0x190 [ 36.756179] __blk_mq_run_hw_queue+0x83/0x90 [ 36.757144] __blk_mq_delay_run_hw_queue+0xaf/0x110 [ 36.758046] blk_mq_run_hw_queue+0x24/0x70 [ 36.758845] blk_mq_flush_plug_list+0x1e7/0x270 [ 36.759676] blk_flush_plug_list+0xd6/0x240 [ 36.760463] blk_finish_plug+0x27/0x40 [ 36.761195] do_io_submit+0x19b/0x780 [ 36.761921] ? entry_SYSCALL_64_fastpath+0x1a/0x7d [ 36.762788] entry_SYSCALL_64_fastpath+0x1a/0x7d [ 36.763639] RIP: 0033:0x7f9b9699f697 [ 36.764352] RSP: 002b:00007ffc10f991b8 EFLAGS: 00000206 ORIG_RAX: 00000000000000d1 [ 36.765773] RAX: ffffffffffffffda RBX: 00000000008f6f00 RCX: 00007f9b9699f697 [ 36.766965] RDX: 0000000000a5e6c0 RSI: 0000000000000001 RDI: 00007f9b8462a000 [ 36.768377] RBP: 0000000000000000 R08: 0000000000000001 R09: 00000000008f6420 [ 36.769649] R10: 00007f9b846e5000 R11: 0000000000000206 R12: 00007f9b795d6a70 [ 36.770807] R13: 00007f9b795e4140 R14: 00007f9b795e3fe0 R15: 0000000100000000 [ 36.771955] Code: 83 c7 10 e9 3f 68 d1 ff 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 97 b0 00 00 00 48 8d 42 08 48 83 c2 38 <48> 3b 00 74 06 b8 01 00 00 00 c3 48 3b 40 08 75 f4 48 83 c0 10 [ 36.775004] RIP: kyber_has_work+0x14/0x40 RSP: ffffc9000209bca0 [ 36.776012] CR2: 0000000000000008 [ 36.776690] ---[ end trace 4045cbce364ff2a4 ]--- [ 36.777527] Kernel panic - not syncing: Fatal exception [ 36.778526] Dumping ftrace buffer: [ 36.779313] (ftrace buffer empty) [ 36.780081] Kernel Offset: disabled [ 36.780877] ---[ end Kernel panic - not syncing: Fatal exception Cc: stable(a)vger.kernel.org Signed-off-by: Ming Lei <ming.lei(a)redhat.com> --- block/blk-core.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/block/blk-core.c b/block/blk-core.c index b8881750a3ac..071a0fb29597 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -694,6 +694,15 @@ void blk_cleanup_queue(struct request_queue *q) queue_flag_set(QUEUE_FLAG_DEAD, q); spin_unlock_irq(lock); + /* + * make sure all in-progress dispatch are completed because + * blk_freeze_queue() can only complete all requests, and + * dispatch may still be in-progress since we dispatch requests + * from more than one contexts + */ + if (q->mq_ops) + blk_mq_quiesce_queue(q); + /* for synchronous bio-based driver finish in-flight integrity i/o */ blk_flush_integrity(); -- 2.9.5

7 years, 8 months

2
1
0 0

[Linux-stable-mirror] [PATCH 4.9 00/21] 4.9.73-stable review

by Greg Kroah-Hartman

This is the start of the stable review cycle for the 4.9.73 release. There are 21 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know. Responses should be made by Fri Dec 29 16:45:43 UTC 2017. Anything received after that time might be too late. The whole patch series can be found in one patch at: kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.73-rc1.gz or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below. thanks, greg k-h ------------- Pseudo-Shortlog of commits: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Linux 4.9.73-rc1 Yelena Krivosheev <yelena(a)marvell.com> net: mvneta: eliminate wrong call to handle rx descriptor error Yelena Krivosheev <yelena(a)marvell.com> net: mvneta: use proper rxq_number in loop on rx queues Yelena Krivosheev <yelena(a)marvell.com> net: mvneta: clear interface link status on port disable Dan Williams <dan.j.williams(a)intel.com> libnvdimm, pfn: fix start_pad handling for aligned namespaces Ravi Bangoria <ravi.bangoria(a)linux.vnet.ibm.com> powerpc/perf: Dereference BHRB entries safely Chen-Yu Tsai <wens(a)csie.org> clk: sunxi: sun9i-mmc: Implement reset callback for reset controls Paolo Bonzini <pbonzini(a)redhat.com> kvm: x86: fix RSM when PCID is non-zero Wanpeng Li <wanpeng.li(a)hotmail.com> KVM: X86: Fix load RFLAGS w/o the fixed bit Mika Westerberg <mika.westerberg(a)linux.intel.com> pinctrl: cherryview: Mask all interrupts on Intel_Strago based systems Ricardo Ribalda Delgado <ricardo.ribalda(a)gmail.com> spi: xilinx: Detect stall with Unknown commands Helge Deller <deller(a)gmx.de> parisc: Hide Diva-built-in serial aux and graphics card Rafael J. Wysocki <rafael.j.wysocki(a)intel.com> PCI / PM: Force devices to D0 in pci_pm_thaw_noirq() Takashi Iwai <tiwai(a)suse.de> ALSA: usb-audio: Fix the missing ctl name suffix at parsing SU Jussi Laako <jussi(a)sonarnerd.net> ALSA: usb-audio: Add native DSD support for Esoteric D-05X Takashi Iwai <tiwai(a)suse.de> ALSA: rawmidi: Avoid racy info ioctl via ctl device Johan Hovold <johan(a)kernel.org> mfd: twl6040: Fix child-node lookup Johan Hovold <johan(a)kernel.org> mfd: twl4030-audio: Fix sibling-node lookup Jon Hunter <jonathanh(a)nvidia.com> mfd: cros ec: spi: Don't send first message too soon Sebastian Andrzej Siewior <bigeasy(a)linutronix.de> crypto: mcryptd - protect the per-CPU queue with a lock Dan Williams <dan.j.williams(a)intel.com> acpi, nfit: fix health event notification Takashi Iwai <tiwai(a)suse.de> ACPI: APEI / ERST: Fix missing error handling in erst_reader() ------------- Diffstat: Makefile | 4 ++-- arch/powerpc/perf/core-book3s.c | 8 ++++++-- arch/x86/kvm/emulate.c | 32 ++++++++++++++++++++++------- arch/x86/kvm/x86.c | 2 +- crypto/mcryptd.c | 23 +++++++++------------ drivers/acpi/apei/erst.c | 2 +- drivers/acpi/nfit/core.c | 9 +++++++- drivers/clk/sunxi/clk-sun9i-mmc.c | 12 +++++++++++ drivers/mfd/cros_ec_spi.c | 1 + drivers/mfd/twl4030-audio.c | 9 ++++++-- drivers/mfd/twl6040.c | 12 +++++++---- drivers/net/ethernet/marvell/mvneta.c | 8 ++++++-- drivers/nvdimm/pfn_devs.c | 5 +++-- drivers/parisc/lba_pci.c | 33 ++++++++++++++++++++++++++++++ drivers/pci/pci-driver.c | 7 ++++++- drivers/pinctrl/intel/pinctrl-cherryview.c | 16 +++++++++++++++ drivers/spi/spi-xilinx.c | 11 ++++++++++ include/crypto/mcryptd.h | 1 + sound/core/rawmidi.c | 15 +++++++++++--- sound/usb/mixer.c | 27 ++++++++++++++---------- sound/usb/quirks.c | 7 ++++--- 21 files changed, 189 insertions(+), 55 deletions(-)

7 years, 8 months

3
23
0 0

[Linux-stable-mirror] Solved 4.14 regression: USB support on A20/sun* (Cubietruck, BananaPi)

by Christoph Biedl

Just a heads-up to avoid more people lose hours in debugging: After upgrading from 4.9 to 4.14 I noticed USB support was broken on my Allwinner A20 boards like Cubietruck and first generation BananaPi. There is simply no output of the lsusb command, and subsequently no connected USB devices are detected. Bisecting led to commit 6254a6a94489c4be717f757bec7d3a372cba1b6e Author: Arnd Bergmann <arnd(a)arndb.de> Date: Thu Apr 27 21:11:48 2017 +0200 power: supply: axp20x_usb_power: add IIO dependency which already happened in the 4.12 development. Turns out after that commit "make oldconfig" will silently drop CONFIG_AXP20X_POWER unless CONFIG_IIO is set - but CONFIG_AXP20X_POWER is needed for USB on these boards. Solution: Enable CONFIG_IIO, but no particular driver is required. I'm not aware whether it's possible to provide a smoother upgrade path in Kconfig, "selects" instead of "depends" might have been an option. Christoph, two more 4.14 regressions to go

7 years, 8 months

2
1
0 0

[Linux-stable-mirror] + userfaultfd-clear-the-vma-vm_userfaultfd_ctx-if-uffd_event_fork-fails.patch added to -mm tree

by akpm＠linux-foundation.org

The patch titled Subject: userfaultfd: clear the vma->vm_userfaultfd_ctx if UFFD_EVENT_FORK fails has been added to the -mm tree. Its filename is userfaultfd-clear-the-vma-vm_userfaultfd_ctx-if-uffd_event_fork-fails.patch This patch should soon appear at http://ozlabs.org/~akpm/mmots/broken-out/userfaultfd-clear-the-vma-vm_userf… and later at http://ozlabs.org/~akpm/mmotm/broken-out/userfaultfd-clear-the-vma-vm_userf… Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/SubmitChecklist when testing your code *** The -mm tree is included into linux-next and is updated there every 3-4 working days ------------------------------------------------------ From: Andrea Arcangeli <aarcange(a)redhat.com> Subject: userfaultfd: clear the vma->vm_userfaultfd_ctx if UFFD_EVENT_FORK fails The previous fix 384632e67e0829 ("userfaultfd: non-cooperative: fix fork use after free") corrected the refcounting in case of UFFD_EVENT_FORK failure for the fork userfault paths. That still didn't clear the vma->vm_userfaultfd_ctx of the vmas that were set to point to the aborted new uffd ctx earlier in dup_userfaultfd. Link: http://lkml.kernel.org/r/20171223002505.593-2-aarcange@redhat.com Signed-off-by: Andrea Arcangeli <aarcange(a)redhat.com> Reported-by: syzbot <syzkaller(a)googlegroups.com> Reviewed-by: Mike Rapoport <rppt(a)linux.vnet.ibm.com> Cc: Eric Biggers <ebiggers3(a)gmail.com> Cc: Dmitry Vyukov <dvyukov(a)google.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- fs/userfaultfd.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff -puN fs/userfaultfd.c~userfaultfd-clear-the-vma-vm_userfaultfd_ctx-if-uffd_event_fork-fails fs/userfaultfd.c --- a/fs/userfaultfd.c~userfaultfd-clear-the-vma-vm_userfaultfd_ctx-if-uffd_event_fork-fails +++ a/fs/userfaultfd.c @@ -570,11 +570,14 @@ out: static void userfaultfd_event_wait_completion(struct userfaultfd_ctx *ctx, struct userfaultfd_wait_queue *ewq) { + struct userfaultfd_ctx *release_new_ctx; + if (WARN_ON_ONCE(current->flags & PF_EXITING)) goto out; ewq->ctx = ctx; init_waitqueue_entry(&ewq->wq, current); + release_new_ctx = NULL; spin_lock(&ctx->event_wqh.lock); /* @@ -601,8 +604,7 @@ static void userfaultfd_event_wait_compl new = (struct userfaultfd_ctx *) (unsigned long) ewq->msg.arg.reserved.reserved1; - - userfaultfd_ctx_put(new); + release_new_ctx = new; } break; } @@ -617,6 +619,20 @@ static void userfaultfd_event_wait_compl __set_current_state(TASK_RUNNING); spin_unlock(&ctx->event_wqh.lock); + if (release_new_ctx) { + struct vm_area_struct *vma; + struct mm_struct *mm = release_new_ctx->mm; + + /* the various vma->vm_userfaultfd_ctx still points to it */ + down_write(&mm->mmap_sem); + for (vma = mm->mmap; vma; vma = vma->vm_next) + if (vma->vm_userfaultfd_ctx.ctx == release_new_ctx) + vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX; + up_write(&mm->mmap_sem); + + userfaultfd_ctx_put(release_new_ctx); + } + /* * ctx may go away after this if the userfault pseudo fd is * already released. _ Patches currently in -mm which might be from aarcange(a)redhat.com are userfaultfd-clear-the-vma-vm_userfaultfd_ctx-if-uffd_event_fork-fails.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] [PATCH 3.2 00/94] 3.2.97-rc1 review

by Ben Hutchings

This is the start of the stable review cycle for the 3.2.97 release. There are 94 patches in this series, which will be posted as responses to this one. If anyone has any issues with these being applied, please let me know. Responses should be made by Mon Jan 1 17:00:00 UTC 2018. Anything received after that time might be too late. All the patches have also been committed to the linux-3.2.y-rc branch of https://git.kernel.org/pub/scm/linux/kernel/git/bwh/linux-stable-rc.git . A shortlog and diffstat can be found below. Ben. ------------- Al Viro (3): Bluetooth: bnep: bnep_add_connection() should verify that it's dealing with l2cap socket [71bb99a02b32b4cc4265118e85f6035ca72923f0] Bluetooth: cmtp: cmtp_add_connection() should verify that it's dealing with l2cap socket [96c26653ce65bf84f3212f8b00d4316c1efcbf4c] more bio_map_user_iov() leak fixes [2b04e8f6bbb196cab4b232af0f8d48ff2c7a8058] Alan Stern (8): USB: core: prevent malicious bNumInterfaces overflow [48a4ff1c7bb5a32d2e396b03132d20d552c0eca7] USB: dummy-hcd: Fix deadlock caused by disconnect detection [ab219221a5064abfff9f78c323c4a257b16cdb81] USB: dummy-hcd: Fix erroneous synchronization change [7dbd8f4cabd96db5a50513de9d83a8105a5ffc81] USB: dummy-hcd: fix infinite-loop resubmission bug [0173a68bfb0ad1c72a6ee39cc485aa2c97540b98] USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks [f16443a034c7aa359ddf6f0f9bc40d01ca31faea] USB: gadgetfs: Fix crash caused by inadequate synchronization [520b72fc64debf8a86c3853b8e486aa5982188f0] USB: gadgetfs: fix copy_to_user while holding spinlock [6e76c01e71551cb221c1f3deacb9dcd9a7346784] usb-storage: unusual_devs entry to fix write-access regression for Seagate external drives [113f6eb6d50cfa5e2a1cdcf1678b12661fa272ab] Andreas Engel (1): USB: serial: cp210x: add support for ELV TFD500 [c496ad835c31ad639b6865714270b3003df031f6] Andreas Gruenbacher (1): vfs: Return -ENXIO for negative SEEK_HOLE / SEEK_DATA offsets [fc46820b27a2d9a46f7e90c9ceb4a64a1bc5fab8] Andrew Honig (1): KVM: VMX: remove I/O port 0x80 bypass on Intel hosts [d59d51f088014f25c2562de59b9abff4f42a7468] Andrey Konovalov (2): uwb: ensure that endpoint is interrupt [70e743e4cec3733dc13559f6184b35d358b9ef3f] uwb: properly check kthread_run return value [bbf26183b7a6236ba602f4d6a2f7cade35bba043] Ashish Samant (1): ocfs2: fstrim: Fix start offset of first cluster group during fstrim [105ddc93f06ebe3e553f58563d11ed63dbcd59f0] Baruch Siach (1): spi: uapi: spidev: add missing ioctl header [a2b4a79b88b24c49d98d45a06a014ffd22ada1a4] Ben Hutchings (2): ipsec: Fix aborted xfrm policy dump crash [1137b5e2529a8f5ca8ee709288ecba3e68044df2] security: Fix mode test in selinux_ptrace_access_check() [69f594a38967f4540ce7a29b3fd214e68a8330bd] Bin Liu (1): usb: gadget: fix spinlock dead lock in gadgetfs [d246dcb2331c5783743720e6510892eb1d2801d9] Boqun Feng (2): kvm/x86: Avoid async PF preempting the kernel incorrectly [a2b7861bb33b2538420bb5d8554153484d3f961f] kvm/x86: Handle async PF in RCU read-side critical sections [b862789aa5186d5ea3a024b7cfe0f80c3a38b980] Borislav Petkov (1): x86/oprofile/ppro: Do not use __this_cpu*() in preemptible context [a743bbeef27b9176987ec0cb7f906ab0ab52d1da] Casey Schaufler (1): lsm: fix smack_inode_removexattr and xattr_getsecurity memleak [57e7ba04d422c3d41c8426380303ec9b7533ded9] Colin Ian King (1): staging: iio: ade7759: fix signed extension bug on shift of a u8 [13ffe9a26df4e156363579b25c904dd0b1e31bfb] Craig Gallek (1): tun/tap: sanitize TUNSETSNDBUF input [93161922c658c714715686cd0cf69b090cb9bf1d] Dan Carpenter (1): tile: array underflow in setup_maxnodemem() [637f23abca87d26e091e0d6647ec878d97d2c6cd] David Herrmann (1): Bluetooth: hidp: verify l2cap sockets [b3916db32c4a3124eee9f3742a2f4723731d7602] Dmitry Fleytman (1): usb: Increase quirk delay for USB devices [b2a542bbb3081dbd64acc8929c140d196664c406] Dmitry Torokhov (1): Input: uinput - avoid FF flush when destroying device [e8b95728f724797f958912fd9b765a695595d3a6] Eric Biggers (13): FS-Cache: fix dereference of NULL user_key_payload [d124b2c53c7bee6569d2a2d0b18b4a1afde00134] KEYS: add missing permission check for request_key() destination [4dca6ea1d9432052afb06baf2e3ae78188a4410b] KEYS: don't revoke uninstantiated key in request_key_auth_new() [f7b48cf08fa63a68b59c2894806ee478216d7f91] KEYS: encrypted: fix dereference of NULL user_key_payload [13923d0865ca96312197962522e88bc0aedccd74] KEYS: fix cred refcount leak in request_key_auth_new() [44d8143340a99b167c74365e844516b73523c087] KEYS: fix key refcount leak in keyctl_assume_authority() [884bee0215fcc239b30c062c37ca29077005e064] KEYS: fix key refcount leak in keyctl_read_key() [7fc0786d956d9e59b68d282be9b156179846ea3d] KEYS: prevent creating a different user's keyrings [237bbd29f7a049d310d907f4b2716a7feef9abf3] KEYS: trusted: fix writing past end of buffer in trusted_read() [a3c812f7cfd80cf51e8f5b7034f7418f6beb56c1] KEYS: trusted: sanitize all key material [ee618b4619b72527aaed765f0f0b74072b281159] crypto: hmac - require that the underlying hash algorithm is unkeyed [af3ff8045bbf3e32f1a448542e73abb4c8ceb6f1] crypto: salsa20 - fix blkcipher_walk API usage [ecaaab5649781c5a0effdaf298a925063020500e] ecryptfs: fix dereference of NULL user_key_payload [f66665c09ab489a11ca490d6a82df57cfc1bea3e] Eric Dumazet (1): tcp: fix tcp_mtu_probe() vs highest_sack [2b7cda9c35d3b940eb9ce74b30bbd5eb30db493d] Ethan Zhao (1): sched/sysctl: Check user input value of sysctl_sched_time_avg [5ccba44ba118a5000cccc50076b0344632459779] Felipe Balbi (1): usb: quirks: add quirk for WORLDE MINI MIDI keyboard [2811501e6d8f5747d08f8e25b9ecf472d0dc4c7d] Geert Uytterhoeven (2): sh: sh7722: remove nonexistent GPIO_PTQ7 to fix pinctrl registration [b78412b8300a8453b78d2c1b0b925b66493bb011] sh: sh7757: remove nonexistent GPIO_PT[JLNQ]7_RESV to fix pinctrl registration [d8ce38f69843a56da044e56b6c16aecfbc3c6e39] Gerald Schaefer (1): s390/mm: fix write access check in gup_huge_pmd() [ba385c0594e723d41790ecfb12c610e6f90c7785] Gleb Natapov (1): KVM: Do not take reference to mm during async #PF [62c49cc976af84cb0ffcb5ec07ee88da1a94e222] Guillaume Nault (6): l2tp: check ps->sock before running pppol2tp_session_ioctl() [5903f594935a3841137c86b9d5b75143a5b7121c] l2tp: don't use l2tp_tunnel_find() in l2tp_ip and l2tp_ip6 [8f7dc9ae4a7aece9fbc3e6637bdfa38b36bcdf09] l2tp: fix l2tp_eth module loading [9f775ead5e570e7e19015b9e4e2f3dd6e71a5935] l2tp: hold socket before dropping lock in l2tp_ip{, 6}_recv() [a3c18422a4b4e108bcf6a2328f48867e1003fd95] l2tp: hold tunnel in pppol2tp_connect() [f9e56baf03f9d36043a78f16e3e8b2cfd211e09e] l2tp: hold tunnel socket when handling control frames in l2tp_ip and l2tp_ip6 [94d7ee0baa8b764cf64ad91ed69464c1a6a0066b] Haozhong Zhang (1): KVM: nVMX: fix guest CR4 loading when emulating L2 to L1 exit [8eb3f87d903168bdbd1222776a6b1e281f50513e] Henryk Heisig (1): USB: serial: option: add support for TP-Link LTE module [837ddc4793a69b256ac5e781a5e729b448a8d983] Herbert Xu (1): crypto: shash - Fix zero-length shash ahash digest crash [b61907bb42409adf9b3120f741af7c57dd7e3db2] Jean Delvare (1): kernel/params.c: align add_sysfs_param documentation with code [630cc2b30a42c70628368a412beb4a5e5dd71abe] Jeffrey Chu (1): USB: serial: ftdi_sio: add id for Cypress WICED dev board [a6c215e21b0dc5fe9416dce90f9acc2ea53c4502] Jim Dickerson (1): usb: pci-quirks.c: Corrected timeout values used in handshake [114ec3a6f9096d211a4aff4277793ba969a62c73] Joerg Roedel (1): iommu/amd: Finish TLB flush in amd_iommu_unmap() [ce76353f169a6471542d999baf3d29b121dce9c0] Johannes Thumshirn (1): scsi: libiscsi: fix shifting of DID_REQUEUE host byte [eef9ffdf9cd39b2986367bc8395e2772bc1284ba] Kazuya Mizuguchi (1): usb: renesas_usbhs: Fix DMAC sequence for receiving zero-length packet [29c7f3e68eec4ae94d85ad7b5dfdafdb8089f513] Konstantin Khlebnikov (1): Smack: remove unneeded NULL-termination from securtity label [da1b63566c469bf3e2b24182114422e16b1aa34c] LEROY Christophe (1): crypto: talitos - fix sha224 [afd62fa26343be6445479e75de9f07092a061459] Maksim Salau (1): usb: cdc_acm: Add quirk for Elatec TWN3 [765fb2f181cad669f2beb87842a05d8071f2be85] Mark Rutland (1): ARM: 8720/1: ensure dump_instr() checks addr_limit [b9dd05c7002ee0ca8b676428b2268c26399b5e31] Martin K. Petersen (1): scsi: sd: Implement blacklist option for WRITE SAME w/ UNMAP [28a0bc4120d38a394499382ba21d6965a67a3703] Mathias Nyman (1): xhci: fix finding correct bus_state structure for USB 3.1 hosts [5a838a13c9b4e5dd188b7a6eaeb894e9358ead0c] Michael S. Tsirkin (1): macvtap: fix TUNSETSNDBUF values > 64k [3ea79249e81e5ed051f2e6480cbde896d99046e8] Mohamed Ghannam (1): dccp: CVE-2017-8824: use-after-free in DCCP code [69c64866ce072dea1d1e59a0d61e0f66c0dffb76] Nicolas Dichtel (1): net: enable interface alias removal via rtnl [2459b4c635858094df78abb9ca87d99f89fe8ca5] Oleg Nesterov (1): ptrace: change __ptrace_unlink() to clear ->ptrace under ->siglock [1333ab03150478df8d6f5673a91df1e50dc6ab97] Oswald Buddenhagen (1): MIPS: AR7: Ensure that serial ports are properly set up [b084116f8587b222a2c5ef6dcd846f40f24b9420] Stefan Mätje (1): can: esd_usb2: Fix can_dlc value for received RTR, frames [72d92e865d1560723e1957ee3f393688c49ca5bf] Stefano Brivio (1): scsi: lpfc: Don't return internal MBXERR_ERROR code from probe function [5c756065e47dc3e84b00577bd109f0a8e69903d7] Steffen Maier (1): scsi: zfcp: fix erp_action use-before-initialize in REC action trace [ab31fd0ce65ec93828b617123792c1bb7c6dcc42] Takashi Iwai (9): ALSA: caiaq: Fix stray URB at probe error path [99fee508245825765ff60155fed43f970ff83a8f] ALSA: seq: Avoid invalid lockdep class warning [3510c7aa069aa83a2de6dab2b41401a198317bdc] ALSA: seq: Fix OSS sysex delivery in OSS emulation [132d358b183ac6ad8b3fea32ad5e0663456d18d1] ALSA: seq: Fix copy_from_user() call inside lock [5803b023881857db32ffefa0d269c90280a67ee0] ALSA: seq: Fix nested rwsem annotation for lockdep splat [1f20f9ff57ca23b9f5502fca85ce3977e8496cb1] ALSA: timer: Add missing mutex lock for compat ioctls [79fb0518fec8c8b4ea7f1729f54f293724b3dbb0] ALSA: timer: Limit max instances per timer [9b7d869ee5a77ed4a462372bb89af622e705bfb8] ALSA: timer: Protect the whole snd_timer_close() with open race [9984d1b5835ca29fc7025186a891ee7398d21cc7] ALSA: usx2y: Suppress kernel warning at page allocation failures [7682e399485fe19622b6fd82510b1f4551e48a25] Wanpeng Li (1): KVM: Fix stack-out-of-bounds read in write_mmio [e39d200fa5bf5b94a0948db0dae44c1b73b84a56] Willem de Bruijn (1): packet: only test po->has_vnet_hdr once in packet_snd [da7c9561015e93d10fe6aab73e9288e0d09d65a6] Xin Long (1): sctp: fix a type cast warnings that causes a_rwnd gets the wrong value [f6fc6bc0b8e0bb13a210bd7386ffdcb1a5f30ef1] Yoshihiro Shimoda (2): usb: renesas_usbhs: fix the BCLR setting condition for non-DCP pipe [6124607acc88fffeaadf3aacfeb3cc1304c87387] usb: renesas_usbhs: fix usbhsf_fifo_clear() for RX direction [0a2ce62b61f2c76d0213edf4e37aaf54a8ddf295] Makefile | 4 +- arch/arm/kernel/traps.c | 28 ++++++--- arch/mips/ar7/platform.c | 1 + arch/s390/mm/gup.c | 7 +-- arch/sh/include/cpu-sh4/cpu/sh7722.h | 2 +- arch/sh/include/cpu-sh4/cpu/sh7757.h | 8 +-- arch/tile/kernel/setup.c | 3 +- arch/x86/crypto/salsa20_glue.c | 7 --- arch/x86/include/asm/kvm_para.h | 4 +- arch/x86/kernel/kvm.c | 22 +++---- arch/x86/kvm/svm.c | 2 +- arch/x86/kvm/vmx.c | 7 +-- arch/x86/kvm/x86.c | 8 +-- arch/x86/oprofile/op_model_ppro.c | 4 +- crypto/hmac.c | 6 +- crypto/salsa20_generic.c | 7 --- crypto/shash.c | 13 ++-- drivers/crypto/talitos.c | 4 +- drivers/input/ff-core.c | 13 +++- drivers/input/misc/uinput.c | 18 ++++++ drivers/iommu/amd_iommu.c | 1 + drivers/net/can/usb/esd_usb2.c | 2 +- drivers/net/macvtap.c | 6 +- drivers/net/tun.c | 4 ++ drivers/s390/scsi/zfcp_aux.c | 5 ++ drivers/s390/scsi/zfcp_erp.c | 18 +++--- drivers/s390/scsi/zfcp_scsi.c | 5 ++ drivers/scsi/libiscsi.c | 2 +- drivers/scsi/lpfc/lpfc_init.c | 1 + drivers/scsi/scsi_scan.c | 3 + drivers/scsi/sd.c | 14 ++++- drivers/staging/iio/meter/ade7759.c | 2 +- drivers/usb/class/cdc-acm.c | 3 + drivers/usb/core/config.c | 6 +- drivers/usb/core/hub.c | 2 +- drivers/usb/core/quirks.c | 4 ++ drivers/usb/gadget/dummy_hcd.c | 56 +++++++++++++---- drivers/usb/gadget/inode.c | 72 ++++++++++++++++----- drivers/usb/host/pci-quirks.c | 8 +-- drivers/usb/host/xhci.h | 2 +- drivers/usb/renesas_usbhs/fifo.c | 23 +++++-- drivers/usb/serial/cp210x.c | 1 + drivers/usb/serial/ftdi_sio.c | 2 + drivers/usb/serial/ftdi_sio_ids.h | 7 +++ drivers/usb/serial/option.c | 2 + drivers/usb/storage/unusual_devs.h | 7 +++ drivers/uwb/hwa-rc.c | 2 + drivers/uwb/uwbd.c | 12 ++-- fs/bio.c | 14 +++-- fs/ecryptfs/ecryptfs_kernel.h | 25 +++++--- fs/ecryptfs/keystore.c | 9 ++- fs/fscache/object-list.c | 7 +++ fs/ocfs2/alloc.c | 24 +++++-- fs/read_write.c | 4 +- fs/xattr.c | 2 +- include/crypto/internal/hash.h | 8 +++ include/linux/input.h | 1 + include/linux/key.h | 2 + include/linux/spi/spidev.h | 1 + include/net/bluetooth/l2cap.h | 1 + include/net/tcp.h | 6 +- include/scsi/scsi_device.h | 1 + include/scsi/scsi_devinfo.h | 1 + include/sound/seq_kernel.h | 3 +- include/sound/seq_virmidi.h | 1 + include/sound/timer.h | 2 + include/trace/events/kvm.h | 7 ++- kernel/params.c | 2 +- kernel/ptrace.c | 3 +- kernel/sysctl.c | 3 +- net/bluetooth/bnep/core.c | 3 + net/bluetooth/cmtp/core.c | 3 + net/bluetooth/hidp/core.c | 2 + net/bluetooth/l2cap_sock.c | 6 ++ net/core/rtnetlink.c | 5 +- net/dccp/proto.c | 5 ++ net/ipv4/tcp_output.c | 3 +- net/l2tp/l2tp_eth.c | 51 +-------------- net/l2tp/l2tp_ip.c | 20 +++--- net/l2tp/l2tp_ppp.c | 10 ++- net/packet/af_packet.c | 4 +- net/sctp/sm_sideeffect.c | 4 +- net/xfrm/xfrm_user.c | 3 +- security/keys/encrypted-keys/encrypted.c | 7 +++ security/keys/internal.h | 2 +- security/keys/key.c | 2 + security/keys/keyctl.c | 8 +-- security/keys/keyring.c | 23 ++++--- security/keys/process_keys.c | 9 ++- security/keys/request_key.c | 46 +++++++++++--- security/keys/request_key_auth.c | 69 ++++++++++----------- security/keys/trusted.c | 71 ++++++++++----------- security/selinux/hooks.c | 2 +- security/smack/smack_lsm.c | 57 ++++++++--------- sound/core/hrtimer.c | 1 + sound/core/seq/oss/seq_oss_midi.c | 4 +- sound/core/seq/oss/seq_oss_readq.c | 29 +++++++++ sound/core/seq/oss/seq_oss_readq.h | 2 + sound/core/seq/seq_clientmgr.c | 2 +- sound/core/seq/seq_virmidi.c | 27 +++++--- sound/core/timer.c | 103 +++++++++++++++++++++---------- sound/core/timer_compat.c | 17 ++++- sound/usb/caiaq/device.c | 12 +++- sound/usb/usx2y/usb_stream.c | 6 +- 104 files changed, 769 insertions(+), 416 deletions(-) -- Ben Hutchings The two most common things in the universe are hydrogen and stupidity.

7 years, 8 months

2
95
0 0

[Linux-stable-mirror] [PATCH 3.16 000/204] 3.16.52-rc1 review

by Ben Hutchings

This is the start of the stable review cycle for the 3.16.52 release. There are 204 patches in this series, which will be posted as responses to this one. If anyone has any issues with these being applied, please let me know. Responses should be made by Mon Jan 1 17:00:00 UTC 2018. Anything received after that time might be too late. All the patches have also been committed to the linux-3.16.y-rc branch of https://git.kernel.org/pub/scm/linux/kernel/git/bwh/linux-stable-rc.git . A shortlog and diffstat can be found below. Ben. ------------- Adrian Salido (1): HID: i2c-hid: allocate hid buffers for real worst case [8320caeeffdefec3b58b9d4a7ed8e1079492fe7b] Al Viro (3): Bluetooth: bnep: bnep_add_connection() should verify that it's dealing with l2cap socket [71bb99a02b32b4cc4265118e85f6035ca72923f0] Bluetooth: cmtp: cmtp_add_connection() should verify that it's dealing with l2cap socket [96c26653ce65bf84f3212f8b00d4316c1efcbf4c] more bio_map_user_iov() leak fixes [2b04e8f6bbb196cab4b232af0f8d48ff2c7a8058] Alan Stern (11): USB: core: prevent malicious bNumInterfaces overflow [48a4ff1c7bb5a32d2e396b03132d20d552c0eca7] USB: dummy-hcd: Fix deadlock caused by disconnect detection [ab219221a5064abfff9f78c323c4a257b16cdb81] USB: dummy-hcd: Fix erroneous synchronization change [7dbd8f4cabd96db5a50513de9d83a8105a5ffc81] USB: dummy-hcd: fix connection failures (wrong speed) [fe659bcc9b173bcfdd958ce2aec75e47651e74e1] USB: dummy-hcd: fix infinite-loop resubmission bug [0173a68bfb0ad1c72a6ee39cc485aa2c97540b98] USB: g_mass_storage: Fix deadlock when driver is unbound [1fbbb78f25d1291274f320462bf6908906f538db] USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks [f16443a034c7aa359ddf6f0f9bc40d01ca31faea] USB: gadgetfs: Fix crash caused by inadequate synchronization [520b72fc64debf8a86c3853b8e486aa5982188f0] USB: gadgetfs: fix copy_to_user while holding spinlock [6e76c01e71551cb221c1f3deacb9dcd9a7346784] usb-storage: fix bogus hardware error messages for ATA pass-thru devices [a4fd4a724d6c30ad671046d83be2e9be2f11d275] usb-storage: unusual_devs entry to fix write-access regression for Seagate external drives [113f6eb6d50cfa5e2a1cdcf1678b12661fa272ab] Alex Estrin (1): Revert "IB/ipoib: Update broadcast object if PKey value was changed in index 0" [612601d0013f03de9dc134809f242ba6da9ca252] Alexey Kodanev (1): vti: fix use after free in vti_tunnel_xmit/vti6_tnl_xmit [36f6ee22d2d66046e369757ec6bbe1c482957ba6] Andreas Engel (1): USB: serial: cp210x: add support for ELV TFD500 [c496ad835c31ad639b6865714270b3003df031f6] Andreas Gruenbacher (2): direct-io: Prevent NULL pointer access in submit_page_section [899f0429c7d3eed886406cd72182bee3b96aa1f9] vfs: Return -ENXIO for negative SEEK_HOLE / SEEK_DATA offsets [fc46820b27a2d9a46f7e90c9ceb4a64a1bc5fab8] Andrei Vagin (1): net/unix: don't show information about sockets from other namespaces [0f5da659d8f1810f44de14acf2c80cd6499623a0] Andrew Gabbasov (1): usb: gadget: composite: Fix use-after-free in usb_composite_overwrite_options [aec17e1e249567e82b26dafbb86de7d07fde8729] Andrew Honig (1): KVM: VMX: remove I/O port 0x80 bypass on Intel hosts [d59d51f088014f25c2562de59b9abff4f42a7468] Andrey Konovalov (2): uwb: ensure that endpoint is interrupt [70e743e4cec3733dc13559f6184b35d358b9ef3f] uwb: properly check kthread_run return value [bbf26183b7a6236ba602f4d6a2f7cade35bba043] Aravind Gopalakrishnan (2): pci_ids: Add PCI device IDs for F15h M60h [4cbbdb51cc921f95978360fd7a0652d493dadc3e] x86, amd_nb: Add device IDs to NB tables for F15h M60h [15895a729e02ea55433b912cc31d5c6de16359ec] Arnd Bergmann (4): ARM: 8715/1: add a private asm/unaligned.h [1cce91dfc8f7990ca3aea896bfb148f240b12860] gpio: acpi: work around false-positive -Wstring-overflow warning [e40a3ae1f794a35c4af3746291ed6fedc1fa0f6f] include/linux/of.h: provide of_n_{addr,size}_cells wrappers for !CONFIG_OF [8a1ac5dc7be09883051b1bf89a5e57d7ad850fa5] usb: gadget: dummy: fix nonsensical comparisons [7661ca09b2ff98f48693f431bb01fed62830e433] Ashish Samant (1): ocfs2: fstrim: Fix start offset of first cluster group during fstrim [105ddc93f06ebe3e553f58563d11ed63dbcd59f0] Baruch Siach (1): spi: uapi: spidev: add missing ioctl header [a2b4a79b88b24c49d98d45a06a014ffd22ada1a4] Ben Hutchings (1): ipsec: Fix aborted xfrm policy dump crash [1137b5e2529a8f5ca8ee709288ecba3e68044df2] Bo Yan (1): tracing: Erase irqsoff trace with empty write [8dd33bcb7050dd6f8c1432732f930932c9d3a33e] Borislav Petkov (3): x86/cpu/AMD: Apply the Erratum 688 fix when the BIOS doesn't [bfc1168de949cd3e9ca18c3480b5085deff1ea7c] x86/microcode/intel: Disable late loading on model 79 [723f2828a98c8ca19842042f418fb30dd8cfc0f7] x86/oprofile/ppro: Do not use __this_cpu*() in preemptible context [a743bbeef27b9176987ec0cb7f906ab0ab52d1da] Casey Schaufler (1): lsm: fix smack_inode_removexattr and xattr_getsecurity memleak [57e7ba04d422c3d41c8426380303ec9b7533ded9] Christophe Jaillet (1): IB/mlx5: Fix the size parameter to find_first_bit [fffd68734dc685e208e86d8c5f6522cd695a8d60] Colin Ian King (2): IB/ocrdma: fix incorrect fall-through on switch statement [06564f60859bdf7e73d70ae35d7e285e96ae9c46] staging: iio: ade7759: fix signed extension bug on shift of a u8 [13ffe9a26df4e156363579b25c904dd0b1e31bfb] Cong Wang (2): tun: call dev_get_valid_name() before register_netdevice() [0ad646c81b2182f7fa67ec0c8c825e0ee165696d] vlan: fix a use-after-free in vlan_device_event() [052d41c01b3a2e3371d66de569717353af489d63] Craig Gallek (1): tun/tap: sanitize TUNSETSNDBUF input [93161922c658c714715686cd0cf69b090cb9bf1d] Dan Carpenter (1): tile: array underflow in setup_maxnodemem() [637f23abca87d26e091e0d6647ec878d97d2c6cd] David Disseldorp (2): SMB: fix leak of validate negotiate info response buffer [fe83bebc05228e838ed5cbbc62712ab50dd40e18] SMB: fix validate negotiate info uninitialised memory use [a2d9daad1d2dfbd307ab158044d1c323d7babbde] Dmitry Fleytman (1): usb: Increase quirk delay for USB devices [b2a542bbb3081dbd64acc8929c140d196664c406] Dmitry Torokhov (3): Input: ims-psu - check if CDC union descriptor is sane [ea04efee7635c9120d015dcdeeeb6988130cb67a] Input: uinput - avoid FF flush when destroying device [e8b95728f724797f958912fd9b765a695595d3a6] Input: uinput - avoid crash when sending FF request to device going away [6b4877c7bdc6ae39ce03716df7caeecf204697eb] Dongjiu Geng (1): arm/arm64: KVM: set right LR register value for 32 bit guest when inject abort [fd6c8c206fc5d0717b0433b191de0715122f33bb] Dragos Bogdan (2): iio: ad7793: Fix the serial interface reset [7ee3b7ebcb74714df6d94c8f500f307e1ee5dda5] iio: ad_sigma_delta: Implement a dedicated reset function [7fc10de8d49a748c476532c9d8e8fe19e548dd67] Eric Biggers (18): FS-Cache: fix dereference of NULL user_key_payload [d124b2c53c7bee6569d2a2d0b18b4a1afde00134] KEYS: add missing permission check for request_key() destination [4dca6ea1d9432052afb06baf2e3ae78188a4410b] KEYS: don't revoke uninstantiated key in request_key_auth_new() [f7b48cf08fa63a68b59c2894806ee478216d7f91] KEYS: encrypted: fix dereference of NULL user_key_payload [13923d0865ca96312197962522e88bc0aedccd74] KEYS: fix NULL pointer dereference during ASN.1 parsing [ver #2] [624f5ab8720b3371367327a822c267699c1823b8] KEYS: fix cred refcount leak in request_key_auth_new() [44d8143340a99b167c74365e844516b73523c087] KEYS: fix key refcount leak in keyctl_assume_authority() [884bee0215fcc239b30c062c37ca29077005e064] KEYS: fix key refcount leak in keyctl_read_key() [7fc0786d956d9e59b68d282be9b156179846ea3d] KEYS: fix out-of-bounds read during ASN.1 parsing [2eb9eabf1e868fda15808954fb29b0f105ed65f1] KEYS: fix writing past end of user-supplied buffer in keyring_read() [e645016abc803dafc75e4b8f6e4118f088900ffb] KEYS: prevent creating a different user's keyrings [237bbd29f7a049d310d907f4b2716a7feef9abf3] KEYS: return full count in keyring_read() if buffer is too small [3239b6f29bdfb4b0a2ba59df995fc9e6f4df7f1f] KEYS: trusted: fix writing past end of buffer in trusted_read() [a3c812f7cfd80cf51e8f5b7034f7418f6beb56c1] KEYS: trusted: sanitize all key material [ee618b4619b72527aaed765f0f0b74072b281159] crypto: hmac - require that the underlying hash algorithm is unkeyed [af3ff8045bbf3e32f1a448542e73abb4c8ceb6f1] crypto: salsa20 - fix blkcipher_walk API usage [ecaaab5649781c5a0effdaf298a925063020500e] ecryptfs: fix dereference of NULL user_key_payload [f66665c09ab489a11ca490d6a82df57cfc1bea3e] lib/digsig: fix dereference of NULL user_key_payload [192cabd6a296cbc57b3d8c05c4c89d87fc102506] Eric Dumazet (3): netfilter: x_tables: avoid stack-out-of-bounds read in xt_copy_counters_from_user [e466af75c074e76107ae1cd5a2823e9c61894ffb] tcp: fastopen: fix on syn-data transmit failure [b5b7db8d680464b1d631fd016f5e093419f0bfd9] tcp: fix tcp_mtu_probe() vs highest_sack [2b7cda9c35d3b940eb9ce74b30bbd5eb30db493d] Eric W. Biederman (5): exec: Ensure mm->user_ns contains the execed files [f84df2a6f268de584a201e8911384a2d244876e3] mm: Add a user_ns owner to mm_struct and fix ptrace permission checks [bfedb589252c01fa505ac9f6f2a3d5d68d707ef4] ptrace: Capture the ptracer's creds not PT_PTRACE_CAP [64b875f7ac8a5d60a4e191479299e931ee949b67] ptrace: Don't allow accessing an undumpable mm [84d77d3f06e7e8dea057d10e8ec77ad71f721be3] ptrace: Properly initialize ptracer_cred on fork [c70d9d809fdeecedb96972457ee45c49a232d97f] Ethan Zhao (1): sched/sysctl: Check user input value of sysctl_sched_time_avg [5ccba44ba118a5000cccc50076b0344632459779] Felipe Balbi (1): usb: quirks: add quirk for WORLDE MINI MIDI keyboard [2811501e6d8f5747d08f8e25b9ecf472d0dc4c7d] Florian Westphal (1): netfilter: ipset: pernet ops must be unregistered last [e23ed762db7ed1950a6408c3be80bc56909ab3d4] Geert Uytterhoeven (4): sh: sh7264: remove nonexistent GPIO_PH[0-7] to fix pinctrl registration [eae3df7e82318d798f45dedf111e241805ec7a4a] sh: sh7269: remove nonexistent GPIO_PH[0-7] to fix pinctrl registration [d9d73e81fe82fdf4ee65a48c26531edc04108349] sh: sh7722: remove nonexistent GPIO_PTQ7 to fix pinctrl registration [b78412b8300a8453b78d2c1b0b925b66493bb011] sh: sh7757: remove nonexistent GPIO_PT[JLNQ]7_RESV to fix pinctrl registration [d8ce38f69843a56da044e56b6c16aecfbc3c6e39] Gerald Schaefer (1): s390/mm: fix write access check in gup_huge_pmd() [ba385c0594e723d41790ecfb12c610e6f90c7785] Guillaume Nault (6): l2tp: check ps->sock before running pppol2tp_session_ioctl() [5903f594935a3841137c86b9d5b75143a5b7121c] l2tp: don't use l2tp_tunnel_find() in l2tp_ip and l2tp_ip6 [8f7dc9ae4a7aece9fbc3e6637bdfa38b36bcdf09] l2tp: fix l2tp_eth module loading [9f775ead5e570e7e19015b9e4e2f3dd6e71a5935] l2tp: hold socket before dropping lock in l2tp_ip{, 6}_recv() [a3c18422a4b4e108bcf6a2328f48867e1003fd95] l2tp: hold tunnel in pppol2tp_connect() [f9e56baf03f9d36043a78f16e3e8b2cfd211e09e] l2tp: hold tunnel socket when handling control frames in l2tp_ip and l2tp_ip6 [94d7ee0baa8b764cf64ad91ed69464c1a6a0066b] Gustavo A. R. Silva (1): MIPS: microMIPS: Fix incorrect mask in insn_table_MM [77238e76b9156d28d86c1e31c00ed2960df0e4de] Hante Meuleman (1): brcmfmac: Add length checks on firmware events [0aedbcaf6f182690790d98d90d5fe1e64c846c34] Haozhong Zhang (1): KVM: nVMX: fix guest CR4 loading when emulating L2 to L1 exit [8eb3f87d903168bdbd1222776a6b1e281f50513e] Henryk Heisig (1): USB: serial: option: add support for TP-Link LTE module [837ddc4793a69b256ac5e781a5e729b448a8d983] Herbert Xu (1): crypto: shash - Fix zero-length shash ahash digest crash [b61907bb42409adf9b3120f741af7c57dd7e3db2] Ilya Dryomov (1): rbd: use GFP_NOIO for parent stat and data requests [1e37f2f84680fa7f8394fd444b6928e334495ccc] Ilya Lesokhin (1): IB/mlx5: Simplify mlx5_ib_cont_pages [d67bc5d4e3e100d762c0f57ea67f28bc219698a6] Jan Luebbe (1): bus: mbus: fix window size calculation for 4GB windows [2bbbd96357ce76cc45ec722c00f654aa7b189112] Jani Nikula (1): drm/i915/bios: ignore HDMI on port A [d27ffc1d00327c29b3aa97f941b42f0949f9e99f] Jann Horn (1): security: let security modules use PTRACE_MODE_* with bitmasks [3dfb7d8cdbc7ea0c2970450e60818bb3eefbad69] Jason A. Donenfeld (1): security/keys: properly zero out sensitive key material in big_key [910801809b2e40a4baedd080ef5d80b4a180e70e] Jean Delvare (1): kernel/params.c: align add_sysfs_param documentation with code [630cc2b30a42c70628368a412beb4a5e5dd71abe] Jeff Lance (1): Input: ti_am335x_tsc - fix incorrect step config for 5 wire touchscreen [cf5dd48907bebaefdb43a8ca079be77e8da2cb20] Jeffrey Chu (1): USB: serial: ftdi_sio: add id for Cypress WICED dev board [a6c215e21b0dc5fe9416dce90f9acc2ea53c4502] Jim Dickerson (1): usb: pci-quirks.c: Corrected timeout values used in handshake [114ec3a6f9096d211a4aff4277793ba969a62c73] Jimmy Assarsson (1): can: kvaser_usb: Correct return value in printout [8f65a923e6b628e187d5e791cf49393dd5e8c2f9] Joerg Roedel (1): iommu/amd: Finish TLB flush in amd_iommu_unmap() [ce76353f169a6471542d999baf3d29b121dce9c0] Johan Hovold (1): USB: serial: metro-usb: add MS7820 device id [31dc3f819bac28a0990b36510197560258ab7421] Johannes Thumshirn (1): scsi: libiscsi: fix shifting of DID_REQUEUE host byte [eef9ffdf9cd39b2986367bc8395e2772bc1284ba] John David Anglin (1): parisc: Fix double-word compare and exchange in LWS code on 32-bit kernels [374b3bf8e8b519f61eb9775888074c6e46b3bf0c] Jonathan Basseri (1): xfrm: Clear sk_dst_cache when applying per-socket policy. [2b06cdf3e688b98fcc9945873b5d42792bd4eee0] Kazuya Mizuguchi (1): usb: renesas_usbhs: Fix DMAC sequence for receiving zero-length packet [29c7f3e68eec4ae94d85ad7b5dfdafdb8089f513] Kevin Cernekee (4): brcmfmac: Add check for short event packets [dd2349121bb1b8ff688c3ca6a2a0bea9d8c142ca] netfilter: nfnetlink_cthelper: Add missing permission checks [4b380c42f7d00a395feede754f0bc2292eebe6e5] netfilter: xt_osf: Add missing permission checks [916a27901de01446bcf57ecca4783f6cff493309] netlink: Add netns check on taps [93c647643b48f0131f02e45da3bd367d80443291] Kirill A. Shutemov (1): mm, thp: Do not make page table dirty unconditionally in touch_p[mu]d() [a8f97366452ed491d13cf1e44241bc0b5740b1f0] Konstantin Khlebnikov (2): Smack: remove unneeded NULL-termination from securtity label [da1b63566c469bf3e2b24182114422e16b1aa34c] net_sched: always reset qdisc backlog in qdisc_reset() [c8e1812960eeae42e2183154927028511c4bc566] LEROY Christophe (2): crypto: talitos - Don't provide setkey for non hmac hashing algs. [56136631573baa537a15e0012055ffe8cfec1a33] crypto: talitos - fix sha224 [afd62fa26343be6445479e75de9f07092a061459] Lauro Ramos Venancio (1): sched/topology: Optimize build_group_mask() [f32d782e31bf079f600dcec126ed117b0577e85c] Li Bin (1): workqueue: Fix NULL pointer dereference [cef572ad9bd7f85035ba8272e5352040e8be0152] Luca Coelho (1): iwlwifi: mvm: use IWL_HCMD_NOCOPY for MCAST_FILTER_CMD [97bce57bd7f96e1218751996f549a6e61f18cc8c] Lukas Wunner (1): iio: adc: mcp320x: Fix oops on module unload [0964e40947a630a2a6f724e968246992f97bcf1c] Maksim Salau (1): usb: cdc_acm: Add quirk for Elatec TWN3 [765fb2f181cad669f2beb87842a05d8071f2be85] Marc Zyngier (1): arm64: Make sure SPsel is always set [5371513fb338fb9989c569dc071326d369d6ade8] Marek Szyprowski (1): iommu/exynos: Remove initconst attribute to avoid potential kernel oops [9d25e3cc83d731ae4eeb017fd07562fde3f80bef] Mark Rutland (3): ARM: 8720/1: ensure dump_instr() checks addr_limit [b9dd05c7002ee0ca8b676428b2268c26399b5e31] arm64: ensure __dump_instr() checks addr_limit [7a7003b1da010d2b0d1dc8bf21c10f5c73b389f1] arm64: fix dump_instr when PAN and UAO are in use [c5cea06be060f38e5400d796e61cfc8c36e52924] Martin K. Petersen (1): scsi: sd: Implement blacklist option for WRITE SAME w/ UNMAP [28a0bc4120d38a394499382ba21d6965a67a3703] Mathias Nyman (2): usb: hub: Allow reset retry for USB2 devices on connect bounce [1ac7db63333db1eeff901bfd6bbcd502b4634fa4] xhci: fix finding correct bus_state structure for USB 3.1 hosts [5a838a13c9b4e5dd188b7a6eaeb894e9358ead0c] Matt Bennett (1): ip6_gre: Reduce log level in ip6gre_err() to debug [a46496ce38eeb401344d5623c1960dbf2f1769be] Matt Fornero (1): iio: core: Return error for failed read_reg [3d62c78a6eb9a7d67bace9622b66ad51e81c5f9b] Matthew Wilcox (1): fs/mpage.c: fix mpage_writepage() for pages with buffers [f892760aa66a2d657deaf59538fb69433036767c] Mayank Rana (1): usb: xhci: Handle error condition in xhci_stop_device() [b3207c65dfafae27e7c492cb9188c0dc0eeaf3fd] Michael S. Tsirkin (1): macvtap: fix TUNSETSNDBUF values > 64k [3ea79249e81e5ed051f2e6480cbde896d99046e8] Miklos Szeredi (1): fuse: fix READDIRPLUS skipping an entry [c6cdd51404b7ac12dd95173ddfc548c59ecf037f] Mohamed Ghannam (1): dccp: CVE-2017-8824: use-after-free in DCCP code [69c64866ce072dea1d1e59a0d61e0f66c0dffb76] Nicolai Stange (1): PCI: Fix race condition with driver_override [9561475db680f7144d2223a409dd3d7e322aca03] Nicolas Dichtel (1): net: enable interface alias removal via rtnl [2459b4c635858094df78abb9ca87d99f89fe8ca5] Oleg Nesterov (1): ptrace: change __ptrace_unlink() to clear ->ptrace under ->siglock [1333ab03150478df8d6f5673a91df1e50dc6ab97] Omar Sandoval (1): Btrfs: fix incorrect {node,sector}size endianness from BTRFS_IOC_FS_INFO [bea7eafdbda3ba1d4b2ccb9cca829eefb7989bb9] Oswald Buddenhagen (1): MIPS: AR7: Ensure that serial ports are properly set up [b084116f8587b222a2c5ef6dcd846f40f24b9420] Paolo Abeni (4): IPv4: early demux can return an error code [7487449c86c65202b3b725c4524cb48dd65e4e6f] ipv4: fix broadcast packets reception [ad0ea1989cc4d5905941d0a9e62c63ad6d859cef] udp: fix bcast packet reception [996b44fcef8f216ea0b6b6e74468c5a77b5e341f] udp: perform source validation for mcast early demux [bc044e8db7962e727a75b591b9851ff2ac5cf846] Paul Burton (1): MIPS: Fix CM region target definitions [6a6cba1d945a7511cdfaf338526871195e420762] Peng Xu (1): nl80211: Define policy for packet pattern attributes [ad670233c9e1d5feb365d870e30083ef1b889177] Peter Zijlstra (3): sched/topology: Remove FORCE_SD_OVERLAP [af85596c74de2fd9abb87501ae280038ac28a3f4] sched/topology: Simplify build_overlap_sched_groups() [91eaed0d61319f58a9f8e43d41a8cbb069b4f73d] x86/uaccess, sched/preempt: Verify access_ok() context [7c4788950ba5922fde976d80b72baf46f14dee8d] Ravi Bangoria (1): powerpc/sysrq: Fix oops whem ppmu is not registered [4917fcb58cc73f6b81455e3c5f960144809ddf1a] Ricard Wanderlof (1): ASoC: adau17x1: Workaround for noise bug in ADC [1e6f4fc06f6411adf98bbbe7fcd79442cd2b2a75] Richard Schütz (1): can: c_can: don't indicate triple sampling support for D_CAN [fb5f0b3ef69b95e665e4bbe8a3de7201f09f1071] Ronnie Sahlberg (1): cifs: check rsp for NULL before dereferencing in SMB2_open [bf2afee14e07de16d3cafc67edbfc2a3cc65e4bc] Sabrina Dubroca (1): l2tp: fix race condition in l2tp_tunnel_delete [62b982eeb4589b2e6d7c01a90590e3a4c2b2ca19] Satoru Takeuchi (1): btrfs: prevent to set invalid default subvolid [6d6d282932d1a609e60dc4467677e0e863682f57] Sekhar Nori (1): ARM: dts: da850-evm: add serial and ethernet aliases [ce21574ad1922b403198ee664c4dff276f514f1d] Shrirang Bagul (1): USB: serial: qcserial: add Dell DW5818, DW5819 [f5d9644c5fca7d8e8972268598bb516a7eae17f9] Shu Wang (2): cifs: release auth_key.response for reconnect. [f5c4ba816315d3b813af16f5571f86c8d4e897bd] cifs: release cifs root_cred after exit_cifs [94183331e815617246b1baa97e0916f358c794bb] Stefan Mätje (1): can: esd_usb2: Fix can_dlc value for received RTR, frames [72d92e865d1560723e1957ee3f393688c49ca5bf] Stefan Popa (1): staging: iio: ad7192: Fix - use the dedicated reset function avoiding dma from stack. [f790923f146140a261ad211e5baf75d169f16fb2] Stefano Brivio (1): scsi: lpfc: Don't return internal MBXERR_ERROR code from probe function [5c756065e47dc3e84b00577bd109f0a8e69903d7] Steffen Maier (1): scsi: zfcp: fix erp_action use-before-initialize in REC action trace [ab31fd0ce65ec93828b617123792c1bb7c6dcc42] Steve French (3): SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags [1013e760d10e614dc10b5624ce9fc41563ba2e65] SMB3: Validate negotiate request must always be signed [4587eee04e2ac7ac3ac9fa2bc164fb6e548f99cd] SMB: Validate negotiate (to protect against downgrade) even if signing off [0603c96f3af50e2f9299fa410c224ab1d465e0f9] Tahsin Erdogan (1): tracing: Fix trace_pipe behavior for instance traces [75df6e688ccd517e339a7c422ef7ad73045b18a2] Takashi Iwai (10): ALSA: caiaq: Fix stray URB at probe error path [99fee508245825765ff60155fed43f970ff83a8f] ALSA: hda: Remove superfluous '-' added by printk conversion [6bf88a343db2b3c160edf9b82a74966b31cc80bd] ALSA: seq: Avoid invalid lockdep class warning [3510c7aa069aa83a2de6dab2b41401a198317bdc] ALSA: seq: Fix OSS sysex delivery in OSS emulation [132d358b183ac6ad8b3fea32ad5e0663456d18d1] ALSA: seq: Fix copy_from_user() call inside lock [5803b023881857db32ffefa0d269c90280a67ee0] ALSA: seq: Fix nested rwsem annotation for lockdep splat [1f20f9ff57ca23b9f5502fca85ce3977e8496cb1] ALSA: timer: Add missing mutex lock for compat ioctls [79fb0518fec8c8b4ea7f1729f54f293724b3dbb0] ALSA: timer: Limit max instances per timer [9b7d869ee5a77ed4a462372bb89af622e705bfb8] ALSA: timer: Protect the whole snd_timer_close() with open race [9984d1b5835ca29fc7025186a891ee7398d21cc7] ALSA: usx2y: Suppress kernel warning at page allocation failures [7682e399485fe19622b6fd82510b1f4551e48a25] Tejun Heo (1): workqueue: replace pool->manager_arb mutex with a flag [692b48258dda7c302e777d7d5f4217244478f1f6] Tyrel Datwyler (1): powerpc/pseries: Fix parent_dn reference leak in add_dt_node() [b537ca6fede69a281dc524983e5e633d79a10a08] Wanpeng Li (1): KVM: Fix stack-out-of-bounds read in write_mmio [e39d200fa5bf5b94a0948db0dae44c1b73b84a56] Will Deacon (1): arm64: fault: Route pte translation faults via do_translation_fault [760bfb47c36a07741a089bf6a28e854ffbee7dc9] Willem de Bruijn (1): packet: only test po->has_vnet_hdr once in packet_snd [da7c9561015e93d10fe6aab73e9288e0d09d65a6] Wolfgang Grandegger (1): can: gs_usb: fix busy loop if no more TX context is available [97819f943063b622eca44d3644067c190dc75039] Xin Long (4): ip6_gre: only increase err_count for some certain type icmpv6 in ip6gre_err [f8d20b46ce55cf40afb30dcef6d9288f7ef46d9b] ip6_gre: skb_push ipv6hdr before packing the header in ip6gre_header [76cc0d3282d4b933fa144fa41fbc5318e0fdca24] sctp: add the missing sock_owned_by_user check in sctp_icmp_redirect [1cc276cec9ec574d41cf47dfc0f51406b6f26ab4] sctp: fix a type cast warnings that causes a_rwnd gets the wrong value [f6fc6bc0b8e0bb13a210bd7386ffdcb1a5f30ef1] Yasuaki Ishimatsu (2): mm/memory_hotplug: change pfn_to_section_nr/section_nr_to_pfn macro to inline function [1dd2bfc86818ddbc95f98e312e7704350223fd7d] mm/memory_hotplug: define find_{smallest|biggest}_section_pfn as unsigned long [d09b0137d204bebeaafed672bc5a244e9ac92edb] Yazen Ghannam (1): x86/amd_nb: Add Fam17h Data Fabric as "Northbridge" [b791c6b6a55c402367cc544f54921074253db061] Ye Yin (1): netfilter/ipvs: clear ipvs_property flag when SKB net namespace changed [2b5ec1a5f9738ee7bf8f5ec0526e75e00362c48f] Yoshihiro Shimoda (2): usb: renesas_usbhs: fix the BCLR setting condition for non-DCP pipe [6124607acc88fffeaadf3aacfeb3cc1304c87387] usb: renesas_usbhs: fix usbhsf_fifo_clear() for RX direction [0a2ce62b61f2c76d0213edf4e37aaf54a8ddf295] Makefile | 4 +- arch/alpha/kernel/ptrace.c | 2 +- arch/arm/boot/dts/da850-evm.dts | 7 ++ arch/arm/include/asm/Kbuild | 1 - arch/arm/include/asm/unaligned.h | 27 +++++ arch/arm/kernel/traps.c | 28 ++++-- arch/arm/kvm/emulate.c | 5 +- arch/arm/kvm/mmio.c | 4 +- arch/arm64/kernel/head.S | 1 + arch/arm64/kernel/traps.c | 28 +++--- arch/arm64/kvm/inject_fault.c | 16 ++- arch/arm64/mm/fault.c | 2 +- arch/blackfin/kernel/ptrace.c | 4 +- arch/cris/arch-v32/kernel/ptrace.c | 2 +- arch/ia64/kernel/ptrace.c | 2 +- arch/mips/ar7/platform.c | 1 + arch/mips/include/asm/mips-cm.h | 4 +- arch/mips/kernel/ptrace32.c | 4 +- arch/mips/mm/uasm-micromips.c | 2 +- arch/parisc/kernel/syscall.S | 6 +- arch/powerpc/kernel/ptrace32.c | 4 +- arch/powerpc/perf/core-book3s.c | 5 + arch/powerpc/platforms/pseries/mobility.c | 4 +- arch/s390/mm/gup.c | 7 +- arch/sh/include/cpu-sh2a/cpu/sh7264.h | 4 +- arch/sh/include/cpu-sh2a/cpu/sh7269.h | 4 +- arch/sh/include/cpu-sh4/cpu/sh7722.h | 2 +- arch/sh/include/cpu-sh4/cpu/sh7757.h | 8 +- arch/tile/kernel/setup.c | 2 +- arch/x86/crypto/salsa20_glue.c | 7 -- arch/x86/include/asm/uaccess.h | 14 ++- arch/x86/kernel/amd_nb.c | 48 +++++++++ arch/x86/kernel/cpu/microcode/intel.c | 18 ++++ arch/x86/kvm/vmx.c | 7 +- arch/x86/kvm/x86.c | 8 +- arch/x86/oprofile/op_model_ppro.c | 4 +- block/bio.c | 14 ++- crypto/hmac.c | 6 +- crypto/salsa20_generic.c | 7 -- crypto/shash.c | 13 ++- drivers/block/rbd.c | 4 +- drivers/bus/mvebu-mbus.c | 2 +- drivers/crypto/talitos.c | 7 +- drivers/gpio/gpiolib-acpi.c | 2 +- drivers/gpu/drm/i915/intel_bios.c | 7 ++ drivers/hid/i2c-hid/i2c-hid.c | 3 +- drivers/iio/adc/ad7793.c | 4 +- drivers/iio/adc/ad_sigma_delta.c | 28 ++++++ drivers/iio/adc/mcp320x.c | 1 + drivers/iio/industrialio-core.c | 4 +- drivers/infiniband/hw/mlx5/mem.c | 49 ++++----- drivers/infiniband/hw/ocrdma/ocrdma_hw.c | 3 + drivers/infiniband/ulp/ipoib/ipoib_ib.c | 13 --- drivers/input/ff-core.c | 13 ++- drivers/input/misc/ims-pcu.c | 16 ++- drivers/input/misc/uinput.c | 57 +++++++---- drivers/input/touchscreen/ti_am335x_tsc.c | 2 +- drivers/iommu/amd_iommu.c | 1 + drivers/iommu/exynos-iommu.c | 2 +- drivers/net/can/c_can/c_can_pci.c | 1 - drivers/net/can/c_can/c_can_platform.c | 1 - drivers/net/can/usb/esd_usb2.c | 2 +- drivers/net/can/usb/gs_usb.c | 10 +- drivers/net/can/usb/kvaser_usb.c | 3 +- drivers/net/macvtap.c | 6 +- drivers/net/tun.c | 7 ++ drivers/net/wireless/brcm80211/brcmfmac/fweh.c | 58 +++-------- drivers/net/wireless/brcm80211/brcmfmac/fweh.h | 68 ++++++++++--- drivers/net/wireless/brcm80211/brcmfmac/p2p.c | 10 ++ .../net/wireless/brcm80211/brcmfmac/wl_cfg80211.c | 5 + drivers/net/wireless/iwlwifi/mvm/mac80211.c | 10 +- drivers/pci/pci-sysfs.c | 11 ++- drivers/s390/scsi/zfcp_aux.c | 5 + drivers/s390/scsi/zfcp_erp.c | 18 ++-- drivers/s390/scsi/zfcp_scsi.c | 5 + drivers/scsi/libiscsi.c | 2 +- drivers/scsi/lpfc/lpfc_init.c | 1 + drivers/scsi/scsi_scan.c | 3 + drivers/scsi/sd.c | 16 ++- drivers/staging/iio/adc/ad7192.c | 4 +- drivers/staging/iio/meter/ade7759.c | 2 +- drivers/usb/class/cdc-acm.c | 3 + drivers/usb/core/config.c | 6 +- drivers/usb/core/hub.c | 13 ++- drivers/usb/core/quirks.c | 4 + drivers/usb/gadget/composite.c | 5 + drivers/usb/gadget/dummy_hcd.c | 81 +++++++++++---- drivers/usb/gadget/f_mass_storage.c | 31 ++---- drivers/usb/gadget/f_mass_storage.h | 14 --- drivers/usb/gadget/inode.c | 55 +++++++++-- drivers/usb/gadget/mass_storage.c | 20 +--- drivers/usb/gadget/net2280.c | 5 +- drivers/usb/host/pci-quirks.c | 8 +- drivers/usb/host/xhci-hub.c | 22 ++++- drivers/usb/host/xhci.h | 2 +- drivers/usb/renesas_usbhs/fifo.c | 23 ++++- drivers/usb/serial/cp210x.c | 1 + drivers/usb/serial/ftdi_sio.c | 2 + drivers/usb/serial/ftdi_sio_ids.h | 7 ++ drivers/usb/serial/metro-usb.c | 1 + drivers/usb/serial/option.c | 2 + drivers/usb/serial/qcserial.c | 4 + drivers/usb/storage/transport.c | 14 ++- drivers/usb/storage/unusual_devs.h | 7 ++ drivers/uwb/hwa-rc.c | 2 + drivers/uwb/uwbd.c | 12 ++- fs/block_dev.c | 6 +- fs/btrfs/ioctl.c | 10 +- fs/cifs/cifsfs.c | 2 +- fs/cifs/connect.c | 8 ++ fs/cifs/file.c | 7 ++ fs/cifs/smb2pdu.c | 34 +++++-- fs/direct-io.c | 3 +- fs/ecryptfs/ecryptfs_kernel.h | 25 +++-- fs/ecryptfs/keystore.c | 9 +- fs/exec.c | 22 ++++- fs/fscache/object-list.c | 7 ++ fs/fuse/dir.c | 3 +- fs/mpage.c | 14 ++- fs/ocfs2/alloc.c | 24 +++-- fs/read_write.c | 4 +- fs/xattr.c | 2 +- include/crypto/internal/hash.h | 8 ++ include/linux/buffer_head.h | 1 + include/linux/capability.h | 2 + include/linux/iio/adc/ad_sigma_delta.h | 3 + include/linux/input.h | 1 + include/linux/key.h | 2 + include/linux/mbus.h | 4 +- include/linux/mm.h | 2 + include/linux/mm_types.h | 1 + include/linux/mmzone.h | 10 +- include/linux/netdevice.h | 3 + include/linux/of.h | 10 ++ include/linux/pci_ids.h | 2 + include/linux/preempt_mask.h | 21 ++-- include/linux/ptrace.h | 11 ++- include/linux/sched.h | 1 + include/linux/skbuff.h | 7 ++ include/net/protocol.h | 2 +- include/net/route.h | 4 +- include/net/tcp.h | 8 +- include/net/udp.h | 2 +- include/scsi/scsi_device.h | 1 + include/scsi/scsi_devinfo.h | 1 + include/sound/seq_kernel.h | 3 +- include/sound/seq_virmidi.h | 1 + include/sound/timer.h | 2 + include/trace/events/kvm.h | 7 +- include/uapi/linux/spi/spidev.h | 1 + kernel/capability.c | 36 ++++++- kernel/fork.c | 9 +- kernel/params.c | 2 +- kernel/ptrace.c | 91 +++++++++++------ kernel/sched/core.c | 21 ++-- kernel/sched/features.h | 1 - kernel/sysctl.c | 3 +- kernel/trace/trace.c | 12 ++- kernel/workqueue.c | 37 +++---- kernel/workqueue_internal.h | 3 +- lib/asn1_decoder.c | 7 +- lib/digsig.c | 6 ++ mm/huge_memory.c | 14 +-- mm/init-mm.c | 2 + mm/memory.c | 2 +- mm/memory_hotplug.c | 6 +- mm/nommu.c | 2 +- net/8021q/vlan.c | 6 +- net/bluetooth/bnep/core.c | 3 + net/bluetooth/cmtp/core.c | 3 + net/core/dev.c | 6 +- net/core/rtnetlink.c | 5 +- net/core/skbuff.c | 1 + net/dccp/proto.c | 5 + net/ipv4/ip_input.c | 22 +++-- net/ipv4/ip_vti.c | 3 +- net/ipv4/route.c | 44 +++++---- net/ipv4/tcp_ipv4.c | 9 +- net/ipv4/tcp_output.c | 12 ++- net/ipv4/udp.c | 30 ++++-- net/ipv6/ip6_gre.c | 41 ++++---- net/ipv6/ip6_vti.c | 3 +- net/l2tp/l2tp_core.c | 10 +- net/l2tp/l2tp_core.h | 5 +- net/l2tp/l2tp_eth.c | 51 +--------- net/l2tp/l2tp_ip.c | 20 ++-- net/l2tp/l2tp_ip6.c | 21 ++-- net/l2tp/l2tp_ppp.c | 10 +- net/netfilter/ipset/ip_set_core.c | 23 +++-- net/netfilter/nfnetlink_cthelper.c | 10 ++ net/netfilter/x_tables.c | 4 +- net/netfilter/xt_osf.c | 7 ++ net/netlink/af_netlink.c | 3 + net/packet/af_packet.c | 4 +- net/sched/sch_generic.c | 1 + net/sctp/input.c | 2 +- net/sctp/sm_sideeffect.c | 4 +- net/unix/diag.c | 2 + net/wireless/nl80211.c | 12 ++- net/xfrm/xfrm_state.c | 1 + net/xfrm/xfrm_user.c | 3 +- security/keys/big_key.c | 2 +- security/keys/encrypted-keys/encrypted.c | 7 ++ security/keys/internal.h | 2 +- security/keys/key.c | 2 + security/keys/keyctl.c | 8 +- security/keys/keyring.c | 72 +++++++------- security/keys/process_keys.c | 8 +- security/keys/request_key.c | 46 +++++++-- security/keys/request_key_auth.c | 69 ++++++------- security/keys/trusted.c | 69 ++++++------- security/smack/smack_lsm.c | 65 ++++++------ security/yama/yama_lsm.c | 4 +- sound/core/hrtimer.c | 1 + sound/core/seq/oss/seq_oss_midi.c | 4 +- sound/core/seq/oss/seq_oss_readq.c | 29 ++++++ sound/core/seq/oss/seq_oss_readq.h | 2 + sound/core/seq/seq_clientmgr.c | 2 +- sound/core/seq/seq_virmidi.c | 27 +++-- sound/core/timer.c | 109 ++++++++++++++------- sound/core/timer_compat.c | 17 +++- sound/pci/hda/hda_codec.c | 2 +- sound/soc/codecs/adau17x1.c | 24 ++++- sound/soc/codecs/adau17x1.h | 2 + sound/usb/caiaq/device.c | 12 ++- sound/usb/usx2y/usb_stream.c | 6 +- 226 files changed, 1733 insertions(+), 945 deletions(-) -- Ben Hutchings The two most common things in the universe are hydrogen and stupidity.

7 years, 8 months

2
205
0 0

[Linux-stable-mirror] [PATCH] drm/i915/gvt: Clear the shadow page table entry after post-sync

by Zhi Wang

A shadow page table entry needs to be cleared after being set as post-sync. This patch fixes the recent error reported in Win7-32 test. Fixes: 2707e4446688 ("drm/i915/gvt: vGPU graphics memory virtualization") Signed-off-by: Zhi Wang <zhi.a.wang(a)intel.com> CC: Stable <stable(a)vger.kernel.org> --- drivers/gpu/drm/i915/gvt/gtt.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c index c4f752e..a529d2b 100644 --- a/drivers/gpu/drm/i915/gvt/gtt.c +++ b/drivers/gpu/drm/i915/gvt/gtt.c @@ -1410,12 +1410,15 @@ static int ppgtt_handle_guest_write_page_table_bytes( return ret; } else { if (!test_bit(index, spt->post_shadow_bitmap)) { + int type = spt->shadow_page.type; + ppgtt_get_shadow_entry(spt, &se, index); ret = ppgtt_handle_guest_entry_removal(gpt, &se, index); if (ret) return ret; + ops->set_pfn(&se, vgpu->gtt.scratch_pt[type].page_mfn); + ppgtt_set_shadow_entry(spt, &se, index); } - ppgtt_set_post_shadow(spt, index); } -- 2.7.4

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] [PATCH] x86/mm: Set MODULES_END to 0xffffffffff000000

by Andrey Ryabinin

Since f06bdd4001c2 ("x86/mm: Adapt MODULES_END based on fixmap section size") kasan_mem_to_shadow(MODULES_END) could be not aligned to a page boundary. So passing page unaligned address to kasan_populate_zero_shadow() have two possible effects: 1) It may leave one page hole in supposed to be populated area. After commit 21506525fb8d ("x86/kasan/64: Teach KASAN about the cpu_entry_area") that hole happens to be in the shadow covering fixmap area and leads to crash: BUG: unable to handle kernel paging request at fffffbffffe8ee04 RIP: 0010:check_memory_region+0x5c/0x190 Call Trace: <NMI> memcpy+0x1f/0x50 ghes_copy_tofrom_phys+0xab/0x180 ghes_read_estatus+0xfb/0x280 ghes_notify_nmi+0x2b2/0x410 nmi_handle+0x115/0x2c0 default_do_nmi+0x57/0x110 do_nmi+0xf8/0x150 end_repeat_nmi+0x1a/0x1e Note, the crash likely disappeared after commit 92a0f81d8957, which changed kasan_populate_zero_shadow() call the way it was before commit 21506525fb8d. 2) Attempt to load module near MODULES_END will fail, because __vmalloc_node_range() called from kasan_module_alloc() will hit the WARN_ON(!pte_none(*pte)) in the vmap_pte_range() and bail out with error. To fix this we need to make kasan_mem_to_shadow(MODULES_END) page aligned which means that MODULES_END should be 8*PAGE_SIZE aligned. The whole point of commit f06bdd4001c2 was to move MODULES_END down if NR_CPUS is big, so the cpu_entry_area takes a lot of space. But since 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap") the cpu_entry_area is no longer in fixmap, so we could just set MODULES_END to a fixed 8*PAGE_SIZE aligned address. Reported-by: Jakub Kicinski <kubakici(a)wp.pl> Signed-off-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com> Cc: <stable(a)vger.kernel.org> # 4.14 --- Documentation/x86/x86_64/mm.txt | 5 +---- arch/x86/include/asm/pgtable_64_types.h | 2 +- 2 files changed, 2 insertions(+), 5 deletions(-) diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt index 51101708a03a..60a0fa2bc01f 100644 --- a/Documentation/x86/x86_64/mm.txt +++ b/Documentation/x86/x86_64/mm.txt @@ -42,7 +42,7 @@ ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space ... unused hole ... ffffffff80000000 - ffffffff9fffffff (=512 MB) kernel text mapping, from phys 0 -ffffffffa0000000 - [fixmap start] (~1526 MB) module mapping space +ffffffffa0000000 - fffffffffeffffff (1520 MB) module mapping space [fixmap start] - ffffffffff5fffff kernel-internal fixmap range ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole @@ -66,9 +66,6 @@ memory window (this size is arbitrary, it can be raised later if needed). The mappings are not part of any other kernel PGD and are only available during EFI runtime calls. -The module mapping space size changes based on the CONFIG requirements for the -following fixmap section. - Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all physical memory, vmalloc/ioremap space and virtual memory map are randomized. Their order is preserved but their base will be offset early at boot time. diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h index 3d27831bc58d..ed2da4e7f259 100644 --- a/arch/x86/include/asm/pgtable_64_types.h +++ b/arch/x86/include/asm/pgtable_64_types.h @@ -100,7 +100,7 @@ typedef struct { pteval_t pte; } pte_t; #define MODULES_VADDR (__START_KERNEL_map + KERNEL_IMAGE_SIZE) /* The module sections ends with the start of the fixmap */ -#define MODULES_END __fix_to_virt(__end_of_fixed_addresses + 1) +#define MODULES_END _AC(0xffffffffff000000, UL) #define MODULES_LEN (MODULES_END - MODULES_VADDR) #define ESPFIX_PGD_ENTRY _AC(-2, UL) -- 2.13.6

7 years, 8 months

1
0
0 0

Re: [Linux-stable-mirror] [PATCH] x86, nmi, dumpstack: Fix panic when syscall interruptted by nmi

by Andy Lutomirski

On Tue, Dec 26, 2017 at 1:11 AM, 王元良 <yuanliang.wyl(a)alibaba-inc.com> wrote: > To Andy and josh： > > Forgive me for not expressing clearly， The context is that we encountered > hundreds of this panic in stable kernel, and hope this change can be merged > into the stable branch linux-3.16.y > > Regards, > Yuanliang Wang > You could try to backport the upstream fix to 3.16. Josh or I could review it. --Andy

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] [PATCH rdma-rc 1/4] RDMA/core: Enforce requirement to hold lists_rwsem semaphore

by Leon Romanovsky

From: Leon Romanovsky <leonro(a)mellanox.com> Add comment and run time check to the __ib_device_get_by_index() function to remind that the caller should hold lists_rwsem semaphore. Cc: <stable(a)vger.kernel.org> # v4.13 Fixes: ecc82c53f9a4 ("RDMA/core: Add and expose static device index") Reviewed-by: Mark Bloch <markb(a)mellanox.com> Signed-off-by: Leon Romanovsky <leonro(a)mellanox.com> --- drivers/infiniband/core/device.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c index 30914f3baa5f..3aeaf11d0a83 100644 --- a/drivers/infiniband/core/device.c +++ b/drivers/infiniband/core/device.c @@ -134,10 +134,15 @@ static int ib_device_check_mandatory(struct ib_device *device) return 0; } +/* + * Caller to this function should hold lists_rwsem + */ struct ib_device *__ib_device_get_by_index(u32 index) { struct ib_device *device; + WARN_ON_ONCE(!rwsem_is_locked(&lists_rwsem)); + list_for_each_entry(device, &device_list, core_list) if (device->index == index) return device; @@ -526,8 +531,8 @@ int ib_register_device(struct ib_device *device, if (!add_client_context(device, client) && client->add) client->add(device); - device->index = __dev_new_index(); down_write(&lists_rwsem); + device->index = __dev_new_index(); list_add_tail(&device->core_list, &device_list); up_write(&lists_rwsem); mutex_unlock(&device_mutex); -- 2.15.1

7 years, 8 months

2
1
0 0

[Linux-stable-mirror] Patch "bpf/verifier: Fix states_equal() comparison of pointer and UNKNOWN" has been added to the 4.9-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled bpf/verifier: Fix states_equal() comparison of pointer and UNKNOWN to the 4.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: bpf-verifier-fix-states_equal-comparison-of-pointer-and-unknown.patch and it can be found in the queue-4.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From ben(a)decadent.org.uk Wed Dec 27 21:04:06 2017 From: Ben Hutchings <ben(a)decadent.org.uk> Date: Sat, 23 Dec 2017 02:26:17 +0000 Subject: bpf/verifier: Fix states_equal() comparison of pointer and UNKNOWN To: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Cc: stable(a)vger.kernel.org, netdev(a)vger.kernel.org, Edward Cree <ecree(a)solarflare.com>, Jann Horn <jannh(a)google.com>, Alexei Starovoitov <ast(a)kernel.org> Message-ID: <20171223022617.GO2971(a)decadent.org.uk> Content-Disposition: inline From: Ben Hutchings <ben(a)decadent.org.uk> An UNKNOWN_VALUE is not supposed to be derived from a pointer, unless pointer leaks are allowed. Therefore, states_equal() must not treat a state with a pointer in a register as "equal" to a state with an UNKNOWN_VALUE in that register. This was fixed differently upstream, but the code around here was largely rewritten in 4.14 by commit f1174f77b50c "bpf/verifier: rework value tracking". The bug can be detected by the bpf/verifier sub-test "pointer/scalar confusion in state equality check (way 1)". Signed-off-by: Ben Hutchings <ben(a)decadent.org.uk> Cc: Edward Cree <ecree(a)solarflare.com> Cc: Jann Horn <jannh(a)google.com> Cc: Alexei Starovoitov <ast(a)kernel.org> Cc: Daniel Borkmann <daniel(a)iogearbox.net> --- kernel/bpf/verifier.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -2722,11 +2722,12 @@ static bool states_equal(struct bpf_veri /* If we didn't map access then again we don't care about the * mismatched range values and it's ok if our old type was - * UNKNOWN and we didn't go to a NOT_INIT'ed reg. + * UNKNOWN and we didn't go to a NOT_INIT'ed or pointer reg. */ if (rold->type == NOT_INIT || (!varlen_map_access && rold->type == UNKNOWN_VALUE && - rcur->type != NOT_INIT)) + rcur->type != NOT_INIT && + !__is_pointer_value(env->allow_ptr_leaks, rcur))) continue; /* Don't care about the reg->id in this case. */ Patches currently in stable-queue which might be from ben(a)decadent.org.uk are queue-4.9/bpf-verifier-fix-states_equal-comparison-of-pointer-and-unknown.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] [PATCH V1 0/1] Fix kernel panic caused by device ID duplication presented to the IOMMU

by Tomasz Nowicki

Here is my lspci output of ThunderX2 for which I am observing kernel panic coming from SMMUv3 driver -> arm_smmu_write_strtab_ent() -> BUG_ON(ste_live): # lspci -vt -[0000:00]-+-00.0-[01-1f]--+ [...] + [...] \-00.0-[1e-1f]----00.0-[1f]----00.0 ASPEED Technology, Inc. ASPEED Graphics Family ASP device -> 1f:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family PCI-Express to PCI/PCI-X Bridge -> 1e:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge While setting up ASP device SID in IORT dirver: iort_iommu_configure() -> pci_for_each_dma_alias() we need to walk up and iterate over each device which alias transaction from downstream devices. AST device (1f:00.0) gets BDF=0x1f00 and corresponding SID=0x1f00 from IORT. Bridge (1e:00.0) is the first alias. Following PCI Express to PCI/PCI-X Bridge spec: PCIe-to-PCI/X bridges alias transactions from downstream devices using the subordinate bus number. For bridge (1e:00.0), the subordinate is equal to 0x1f. This gives BDF=0x1f00 and SID=1f00 which is the same as downstream device. So it is possible to have two identical SIDs. The question is what we should do about such case. Presented patch prevents from registering the same ID so that SMMUv3 is not complaining later on. Tomasz Nowicki (1): iommu: Make sure device's ID array elements are unique drivers/iommu/iommu.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) -- 2.7.4

7 years, 8 months

5
7
0 0

[Linux-stable-mirror] [PATCH 5/5] tracing: Fix possible double free on failure of allocating trace buffer

by Steven Rostedt

From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org> Jing Xia and Chunyan Zhang reported that on failing to allocate part of the tracing buffer, memory is freed, but the pointers that point to them are not initialized back to NULL, and later paths may try to free the freed memory again. Jing and Chunyan fixed one of the locations that does this, but missed a spot. Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com Cc: stable(a)vger.kernel.org Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code") Reported-by: Jing Xia <jing.xia(a)spreadtrum.com> Reported-by: Chunyan Zhang <chunyan.zhang(a)spreadtrum.com> Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> --- kernel/trace/trace.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 0e53d46544b8..2a8d8a294345 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -7580,6 +7580,7 @@ allocate_trace_buffer(struct trace_array *tr, struct trace_buffer *buf, int size buf->data = alloc_percpu(struct trace_array_cpu); if (!buf->data) { ring_buffer_free(buf->buffer); + buf->buffer = NULL; return -ENOMEM; } -- 2.13.2

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] [PATCH 4/5] tracing: Fix crash when it fails to alloc ring buffer

by Steven Rostedt

From: Jing Xia <jing.xia(a)spreadtrum.com> Double free of the ring buffer happens when it fails to alloc new ring buffer instance for max_buffer if TRACER_MAX_TRACE is configured. The root cause is that the pointer is not set to NULL after the buffer is freed in allocate_trace_buffers(), and the freeing of the ring buffer is invoked again later if the pointer is not equal to Null, as: instance_mkdir() |-allocate_trace_buffers() |-allocate_trace_buffer(tr, &tr->trace_buffer...) |-allocate_trace_buffer(tr, &tr->max_buffer...) // allocate fail(-ENOMEM),first free // and the buffer pointer is not set to null |-ring_buffer_free(tr->trace_buffer.buffer) // out_free_tr |-free_trace_buffers() |-free_trace_buffer(&tr->trace_buffer); //if trace_buffer is not null, free again |-ring_buffer_free(buf->buffer) |-rb_free_cpu_buffer(buffer->buffers[cpu]) // ring_buffer_per_cpu is null, and // crash in ring_buffer_per_cpu->pages Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com Cc: stable(a)vger.kernel.org Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code") Signed-off-by: Jing Xia <jing.xia(a)spreadtrum.com> Signed-off-by: Chunyan Zhang <chunyan.zhang(a)spreadtrum.com> Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> --- kernel/trace/trace.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 73652d5318b2..0e53d46544b8 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -7603,7 +7603,9 @@ static int allocate_trace_buffers(struct trace_array *tr, int size) allocate_snapshot ? size : 1); if (WARN_ON(ret)) { ring_buffer_free(tr->trace_buffer.buffer); + tr->trace_buffer.buffer = NULL; free_percpu(tr->trace_buffer.data); + tr->trace_buffer.data = NULL; return -ENOMEM; } tr->allocated_snapshot = allocate_snapshot; -- 2.13.2

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] [PATCH 3/5] ring-buffer: Do no reuse reader page if still in use

by Steven Rostedt

From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org> To free the reader page that is allocated with ring_buffer_alloc_read_page(), ring_buffer_free_read_page() must be called. For faster performance, this page can be reused by the ring buffer to avoid having to free and allocate new pages. The issue arises when the page is used with a splice pipe into the networking code. The networking code may up the page counter for the page, and keep it active while sending it is queued to go to the network. The incrementing of the page ref does not prevent it from being reused in the ring buffer, and this can cause the page that is being sent out to the network to be modified before it is sent by reading new data. Add a check to the page ref counter, and only reuse the page if it is not being used anywhere else. Cc: stable(a)vger.kernel.org Fixes: 73a757e63114d ("ring-buffer: Return reader page back into existing ring buffer") Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> --- kernel/trace/ring_buffer.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index e06cde093f76..9ab18995ff1e 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -4404,8 +4404,13 @@ void ring_buffer_free_read_page(struct ring_buffer *buffer, int cpu, void *data) { struct ring_buffer_per_cpu *cpu_buffer = buffer->buffers[cpu]; struct buffer_data_page *bpage = data; + struct page *page = virt_to_page(bpage); unsigned long flags; + /* If the page is still in use someplace else, we can't reuse it */ + if (page_ref_count(page) > 1) + goto out; + local_irq_save(flags); arch_spin_lock(&cpu_buffer->lock); @@ -4417,6 +4422,7 @@ void ring_buffer_free_read_page(struct ring_buffer *buffer, int cpu, void *data) arch_spin_unlock(&cpu_buffer->lock); local_irq_restore(flags); + out: free_page((unsigned long)bpage); } EXPORT_SYMBOL_GPL(ring_buffer_free_read_page); -- 2.13.2

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] [PATCH 2/5] tracing: Remove extra zeroing out of the ring buffer page

by Steven Rostedt

From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org> The ring_buffer_read_page() takes care of zeroing out any extra data in the page that it returns. There's no need to zero it out again from the consumer. It was removed from one consumer of this function, but read_buffers_splice_read() did not remove it, and worse, it contained a nasty bug because of it. Cc: stable(a)vger.kernel.org Fixes: 2711ca237a084 ("ring-buffer: Move zeroing out excess in page to ring buffer code") Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> --- kernel/trace/trace.c | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 59518b8126d0..73652d5318b2 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -6769,7 +6769,7 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos, .spd_release = buffer_spd_release, }; struct buffer_ref *ref; - int entries, size, i; + int entries, i; ssize_t ret = 0; #ifdef CONFIG_TRACER_MAX_TRACE @@ -6823,14 +6823,6 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos, break; } - /* - * zero out any left over data, this is going to - * user land. - */ - size = ring_buffer_page_len(ref->page); - if (size < PAGE_SIZE) - memset(ref->page + size, 0, PAGE_SIZE - size); - page = virt_to_page(ref->page); spd.pages[i] = page; -- 2.13.2

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] [PATCH 1/5] ring-buffer: Mask out the info bits when returning buffer page length

by Steven Rostedt

From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org> Two info bits were added to the "commit" part of the ring buffer data page when returned to be consumed. This was to inform the user space readers that events have been missed, and that the count may be stored at the end of the page. What wasn't handled, was the splice code that actually called a function to return the length of the data in order to zero out the rest of the page before sending it up to user space. These data bits were returned with the length making the value negative, and that negative value was not checked. It was compared to PAGE_SIZE, and only used if the size was less than PAGE_SIZE. Luckily PAGE_SIZE is unsigned long which made the compare an unsigned compare, meaning the negative size value did not end up causing a large portion of memory to be randomly zeroed out. Cc: stable(a)vger.kernel.org Fixes: 66a8cb95ed040 ("ring-buffer: Add place holder recording of dropped events") Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> --- kernel/trace/ring_buffer.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index c87766c1c204..e06cde093f76 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -280,6 +280,8 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data); /* Missed count stored at end */ #define RB_MISSED_STORED (1 << 30) +#define RB_MISSED_FLAGS (RB_MISSED_EVENTS|RB_MISSED_STORED) + struct buffer_data_page { u64 time_stamp; /* page time stamp */ local_t commit; /* write committed index */ @@ -331,7 +333,9 @@ static void rb_init_page(struct buffer_data_page *bpage) */ size_t ring_buffer_page_len(void *page) { - return local_read(&((struct buffer_data_page *)page)->commit) + struct buffer_data_page *bpage = page; + + return (local_read(&bpage->commit) & ~RB_MISSED_FLAGS) + BUF_PAGE_HDR_SIZE; } -- 2.13.2

7 years, 8 months

1
0
0 0

Re: [Linux-stable-mirror] v4.14.9 BUG, regression

by Greg KH

On Tue, Dec 26, 2017 at 03:00:55AM +0100, Petr Janecek wrote: > Hi, > the machine started dying reliably shortly after boot after upgrading > to 4.14.9 from 4.14.8. Debian stretch, xfs on md raid10, btrfs. > > > [ 230.855352] BUG: unable to handle kernel paging request at 0000000100000001 > [ 230.862449] IP: free_block+0x135/0x1f0 > [ 230.866301] PGD 0 P4D 0 > [ 230.868939] Oops: 0002 [#1] SMP > [ 230.872178] Modules linked in: xfs x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel pcbc aesni_intel mei_me iTCO_wdt aes_x86_64 crypto_simd cryptd glue_helper pcspkr iTCO_vendor_support ipmi_si tpm_tis mei evdev tpm_tis_core ipmi_devintf ipmi_msghandler battery sg video tpm acpi_power_meter button ie31200_edac shpchp nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc loop ip_tables x_tables autofs4 btrfs zstd_decompress zstd_compress xxhash raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear raid10 md_mod uas usb_storage hid_generic usbhid hid sd_mod igb ahci xhci_pci i2c_algo_bit libahci mpt3sas xhci_hcd dca raid_class ptp libata i2c_i801 scsi_transport_sas crc32c_intel usbcore > [ 230.943379] i2c_core pps_core usb_common scsi_mod fan thermal > [ 230.949317] CPU: 0 PID: 57 Comm: kworker/0:1 Not tainted 4.14.9 #23 > [ 230.955688] Hardware name: Supermicro Super Server/X11SSL-CF, BIOS 1.0a 01/29/2016 > [ 230.963357] Workqueue: events cache_reap > [ 230.967380] task: ffff8864b362c000 task.stack: ffff9f0d43380000 > [ 230.973420] RIP: 0010:free_block+0x135/0x1f0 > [ 230.977806] RSP: 0018:ffff9f0d43383d88 EFLAGS: 00010006 > [ 230.983146] RAX: ffffe8625d2e3908 RBX: 0000000080000000 RCX: 0000000000000003 > [ 230.990398] RDX: 00000000fffffffe RSI: ffff8864b7822df0 RDI: ffff8864b6c00480 > [ 230.997654] RBP: dead000000000200 R08: ffff8864b6c01558 R09: ffff8864b6c01540 > [ 231.004909] R10: 00000000006fe8f1 R11: ffff886496597e00 R12: dead000000000100 > [ 231.012162] R13: ffffffff80000000 R14: ffff8864b7822e28 R15: ffffe8625d2e3928 > [ 231.019416] FS: 0000000000000000(0000) GS:ffff8864b7800000(0000) knlGS:0000000000000000 > [ 231.027643] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 231.033509] CR2: 0000000100000001 CR3: 00000005e680b006 CR4: 00000000003606f0 > [ 231.040763] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 231.048017] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [ 231.055272] Call Trace: > [ 231.057840] drain_array_locked+0x5a/0x90 > [ 231.061963] drain_array+0x60/0x80 > [ 231.065484] cache_reap+0x67/0x1d0 > [ 231.069004] process_one_work+0x1c0/0x3e0 > [ 231.073128] worker_thread+0x42/0x3e0 > [ 231.076907] kthread+0xf7/0x130 > [ 231.080167] ? create_worker+0x180/0x180 > [ 231.084206] ? kthread_create_on_node+0x40/0x40 > [ 231.088850] ret_from_fork+0x1f/0x30 > [ 231.092542] Code: 4f 1c 49 c1 ea 20 44 29 d2 d3 ea 0f b6 4f 1d 41 01 d2 41 d3 ea 8b 48 18 8d 51 ff 48 8b 48 10 89 50 18 48 85 c9 0f 84 a3 00 00 00 <44> 88 14 11 8b 50 18 85 d2 0f 84 26 ff ff ff 49 8b 51 20 48 83 > [ 231.111583] RIP: free_block+0x135/0x1f0 RSP: ffff9f0d43383d88 > [ 231.117443] CR2: 0000000100000001 > [ 231.120875] ---[ end trace e4cdf71e69fa010e ]--- Any chance you can run 'git bisect' to find the offending patch here? thanks, greg k-h

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] FAILED: patch "[PATCH] kvm: x86: fix WARN due to uninitialized guest FPU state" failed to apply to 4.14-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.14-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 5663d8f9bbe4bf15488f7351efb61ea20fa6de06 Mon Sep 17 00:00:00 2001 From: Peter Xu <peterx(a)redhat.com> Date: Tue, 12 Dec 2017 17:15:02 +0100 Subject: [PATCH] kvm: x86: fix WARN due to uninitialized guest FPU state ------------[ cut here ]------------ Bad FPU state detected at kvm_put_guest_fpu+0xd8/0x2d0 [kvm], reinitializing FPU registers. WARNING: CPU: 1 PID: 4594 at arch/x86/mm/extable.c:103 ex_handler_fprestore+0x88/0x90 CPU: 1 PID: 4594 Comm: qemu-system-x86 Tainted: G B OE 4.15.0-rc2+ #10 RIP: 0010:ex_handler_fprestore+0x88/0x90 Call Trace: fixup_exception+0x4e/0x60 do_general_protection+0xff/0x270 general_protection+0x22/0x30 RIP: 0010:kvm_put_guest_fpu+0xd8/0x2d0 [kvm] RSP: 0018:ffff8803d5627810 EFLAGS: 00010246 kvm_vcpu_reset+0x3b4/0x3c0 [kvm] kvm_apic_accept_events+0x1c0/0x240 [kvm] kvm_arch_vcpu_ioctl_run+0x1658/0x2fb0 [kvm] kvm_vcpu_ioctl+0x479/0x880 [kvm] do_vfs_ioctl+0x142/0x9a0 SyS_ioctl+0x74/0x80 do_syscall_64+0x15f/0x600 where kvm_put_guest_fpu is called without a prior kvm_load_guest_fpu. To fix it, move kvm_load_guest_fpu to the very beginning of kvm_arch_vcpu_ioctl_run. Cc: stable(a)vger.kernel.org Fixes: f775b13eedee2f7f3c6fdd4e90fb79090ce5d339 Signed-off-by: Peter Xu <peterx(a)redhat.com> Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 154ea27746e9..56d036b9ad75 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7264,13 +7264,12 @@ static int complete_emulated_mmio(struct kvm_vcpu *vcpu) int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) { - struct fpu *fpu = &current->thread.fpu; int r; - fpu__initialize(fpu); - kvm_sigset_activate(vcpu); + kvm_load_guest_fpu(vcpu); + if (unlikely(vcpu->arch.mp_state == KVM_MP_STATE_UNINITIALIZED)) { if (kvm_run->immediate_exit) { r = -EINTR; @@ -7296,14 +7295,12 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) } } - kvm_load_guest_fpu(vcpu); - if (unlikely(vcpu->arch.complete_userspace_io)) { int (*cui)(struct kvm_vcpu *) = vcpu->arch.complete_userspace_io; vcpu->arch.complete_userspace_io = NULL; r = cui(vcpu); if (r <= 0) - goto out_fpu; + goto out; } else WARN_ON(vcpu->arch.pio.count || vcpu->mmio_needed); @@ -7312,9 +7309,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) else r = vcpu_run(vcpu); -out_fpu: - kvm_put_guest_fpu(vcpu); out: + kvm_put_guest_fpu(vcpu); post_kvm_run_save(vcpu); kvm_sigset_deactivate(vcpu);

7 years, 8 months

2
1
0 0

[Linux-stable-mirror] [PATCH] Revert "ipmi_si: fix memory leak on new_smi"

by minyard＠acm.org

From: Corey Minyard <cminyard(a)mvista.com> This reverts commit c97e41076a298dbc4e910c33048e553658388eed. A backport was requested of c0a32fe13cd32 "ipmi_si: fix memory leak on new_smi", but the backport shouldn't have been done. This change needs to be reverted, as it can result in an oops and the previous code is correct. Reverts: c97e41076a29 ("ipmi_si: fix memory leak on new_smi") Link: https://bbs.archlinux.org/viewtopic.php?pid=1757130#p1757130 Reported-by: Neil Romig <neil(a)sixtythree.me.uk> Cc: Sasha Levin <alexander.levin(a)verizon.com> Signed-off-by: Corey Minyard <cminyard(a)mvista.com> --- This is for 4.14 stable tree only. In hindsight, I should scrutinize stable kernel requests from others in the IPMI tree. Sorry about that. drivers/char/ipmi/ipmi_si_intf.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c index e1cbb78..c04aa11 100644 --- a/drivers/char/ipmi/ipmi_si_intf.c +++ b/drivers/char/ipmi/ipmi_si_intf.c @@ -3469,7 +3469,6 @@ static int add_smi(struct smi_info *new_smi) ipmi_addr_src_to_str(new_smi->addr_source), si_to_str[new_smi->si_type]); rv = -EBUSY; - kfree(new_smi); goto out_err; } } -- 2.7.4

7 years, 8 months

2
1
0 0

[Linux-stable-mirror] Patch "libnvdimm, dax: fix 1GB-aligned namespaces vs physical misalignment" has been added to the 4.9-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled libnvdimm, dax: fix 1GB-aligned namespaces vs physical misalignment to the 4.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: libnvdimm-dax-fix-1gb-aligned-namespaces-vs-physical-misalignment.patch and it can be found in the queue-4.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 41fce90f26333c4fa82e8e43b9ace86c4e8a0120 Mon Sep 17 00:00:00 2001 From: Dan Williams <dan.j.williams(a)intel.com> Date: Mon, 4 Dec 2017 14:07:43 -0800 Subject: libnvdimm, dax: fix 1GB-aligned namespaces vs physical misalignment From: Dan Williams <dan.j.williams(a)intel.com> commit 41fce90f26333c4fa82e8e43b9ace86c4e8a0120 upstream. The following namespace configuration attempt: # ndctl create-namespace -e namespace0.0 -m devdax -a 1G -f libndctl: ndctl_dax_enable: dax0.1: failed to enable Error: namespace0.0: failed to enable failed to reconfigure namespace: No such device or address ...fails when the backing memory range is not physically aligned to 1G: # cat /proc/iomem | grep Persistent 210000000-30fffffff : Persistent Memory (legacy) In the above example the 4G persistent memory range starts and ends on a 256MB boundary. We handle this case correctly when needing to handle cases that violate section alignment (128MB) collisions against "System RAM", and we simply need to extend that padding/truncation for the 1GB alignment use case. Fixes: 315c562536c4 ("libnvdimm, pfn: add 'align' attribute...") Reported-and-tested-by: Jane Chu <jane.chu(a)oracle.com> Signed-off-by: Dan Williams <dan.j.williams(a)intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/nvdimm/pfn_devs.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) --- a/drivers/nvdimm/pfn_devs.c +++ b/drivers/nvdimm/pfn_devs.c @@ -562,6 +562,12 @@ static struct vmem_altmap *__nvdimm_setu return altmap; } +static u64 phys_pmem_align_down(struct nd_pfn *nd_pfn, u64 phys) +{ + return min_t(u64, PHYS_SECTION_ALIGN_DOWN(phys), + ALIGN_DOWN(phys, nd_pfn->align)); +} + static int nd_pfn_init(struct nd_pfn *nd_pfn) { u32 dax_label_reserve = is_nd_dax(&nd_pfn->dev) ? SZ_128K : 0; @@ -617,13 +623,16 @@ static int nd_pfn_init(struct nd_pfn *nd start = nsio->res.start; size = PHYS_SECTION_ALIGN_UP(start + size) - start; if (region_intersects(start, size, IORESOURCE_SYSTEM_RAM, - IORES_DESC_NONE) == REGION_MIXED) { + IORES_DESC_NONE) == REGION_MIXED + || !IS_ALIGNED(start + resource_size(&nsio->res), + nd_pfn->align)) { size = resource_size(&nsio->res); - end_trunc = start + size - PHYS_SECTION_ALIGN_DOWN(start + size); + end_trunc = start + size - phys_pmem_align_down(nd_pfn, + start + size); } if (start_pad + end_trunc) - dev_info(&nd_pfn->dev, "%s section collision, truncate %d bytes\n", + dev_info(&nd_pfn->dev, "%s alignment collision, truncate %d bytes\n", dev_name(&ndns->dev), start_pad + end_trunc); /* Patches currently in stable-queue which might be from dan.j.williams(a)intel.com are queue-4.9/libnvdimm-pfn-fix-start_pad-handling-for-aligned-namespaces.patch queue-4.9/libnvdimm-dax-fix-1gb-aligned-namespaces-vs-physical-misalignment.patch queue-4.9/acpi-nfit-fix-health-event-notification.patch

7 years, 8 months

2
1
0 0

[Linux-stable-mirror] FAILED: patch "[PATCH] libnvdimm, dax: fix 1GB-aligned namespaces vs physical" failed to apply to 4.9-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.9-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 41fce90f26333c4fa82e8e43b9ace86c4e8a0120 Mon Sep 17 00:00:00 2001 From: Dan Williams <dan.j.williams(a)intel.com> Date: Mon, 4 Dec 2017 14:07:43 -0800 Subject: [PATCH] libnvdimm, dax: fix 1GB-aligned namespaces vs physical misalignment The following namespace configuration attempt: # ndctl create-namespace -e namespace0.0 -m devdax -a 1G -f libndctl: ndctl_dax_enable: dax0.1: failed to enable Error: namespace0.0: failed to enable failed to reconfigure namespace: No such device or address ...fails when the backing memory range is not physically aligned to 1G: # cat /proc/iomem | grep Persistent 210000000-30fffffff : Persistent Memory (legacy) In the above example the 4G persistent memory range starts and ends on a 256MB boundary. We handle this case correctly when needing to handle cases that violate section alignment (128MB) collisions against "System RAM", and we simply need to extend that padding/truncation for the 1GB alignment use case. Cc: <stable(a)vger.kernel.org> Fixes: 315c562536c4 ("libnvdimm, pfn: add 'align' attribute...") Reported-and-tested-by: Jane Chu <jane.chu(a)oracle.com> Signed-off-by: Dan Williams <dan.j.williams(a)intel.com> diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c index db2fc7c02e01..2adada1a5855 100644 --- a/drivers/nvdimm/pfn_devs.c +++ b/drivers/nvdimm/pfn_devs.c @@ -583,6 +583,12 @@ static struct vmem_altmap *__nvdimm_setup_pfn(struct nd_pfn *nd_pfn, return altmap; } +static u64 phys_pmem_align_down(struct nd_pfn *nd_pfn, u64 phys) +{ + return min_t(u64, PHYS_SECTION_ALIGN_DOWN(phys), + ALIGN_DOWN(phys, nd_pfn->align)); +} + static int nd_pfn_init(struct nd_pfn *nd_pfn) { u32 dax_label_reserve = is_nd_dax(&nd_pfn->dev) ? SZ_128K : 0; @@ -638,13 +644,16 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn) start = nsio->res.start; size = PHYS_SECTION_ALIGN_UP(start + size) - start; if (region_intersects(start, size, IORESOURCE_SYSTEM_RAM, - IORES_DESC_NONE) == REGION_MIXED) { + IORES_DESC_NONE) == REGION_MIXED + || !IS_ALIGNED(start + resource_size(&nsio->res), + nd_pfn->align)) { size = resource_size(&nsio->res); - end_trunc = start + size - PHYS_SECTION_ALIGN_DOWN(start + size); + end_trunc = start + size - phys_pmem_align_down(nd_pfn, + start + size); } if (start_pad + end_trunc) - dev_info(&nd_pfn->dev, "%s section collision, truncate %d bytes\n", + dev_info(&nd_pfn->dev, "%s alignment collision, truncate %d bytes\n", dev_name(&ndns->dev), start_pad + end_trunc); /*

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "pinctrl: cherryview: Mask all interrupts on Intel_Strago based systems" has been added to the 4.4-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled pinctrl: cherryview: Mask all interrupts on Intel_Strago based systems to the 4.4-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: pinctrl-cherryview-mask-all-interrupts-on-intel_strago-based-systems.patch and it can be found in the queue-4.4 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From d2b3c353595a855794f8b9df5b5bdbe8deb0c413 Mon Sep 17 00:00:00 2001 From: Mika Westerberg <mika.westerberg(a)linux.intel.com> Date: Mon, 4 Dec 2017 12:11:02 +0300 Subject: pinctrl: cherryview: Mask all interrupts on Intel_Strago based systems From: Mika Westerberg <mika.westerberg(a)linux.intel.com> commit d2b3c353595a855794f8b9df5b5bdbe8deb0c413 upstream. Guenter Roeck reported an interrupt storm on a prototype system which is based on Cyan Chromebook. The root cause turned out to be a incorrectly configured pin that triggers spurious interrupts. This will be fixed in coreboot but currently we need to prevent the interrupt storm from happening by masking all interrupts (but not GPEs) on those systems. Link: https://bugzilla.kernel.org/show_bug.cgi?id=197953 Fixes: bcb48cca23ec ("pinctrl: cherryview: Do not mask all interrupts in probe") Reported-and-tested-by: Guenter Roeck <linux(a)roeck-us.net> Reported-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com> Signed-off-by: Mika Westerberg <mika.westerberg(a)linux.intel.com> Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/pinctrl/intel/pinctrl-cherryview.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) --- a/drivers/pinctrl/intel/pinctrl-cherryview.c +++ b/drivers/pinctrl/intel/pinctrl-cherryview.c @@ -1466,6 +1466,22 @@ static int chv_gpio_probe(struct chv_pin offset += range->npins; } + /* + * The same set of machines in chv_no_valid_mask[] have incorrectly + * configured GPIOs that generate spurious interrupts so we use + * this same list to apply another quirk for them. + * + * See also https://bugzilla.kernel.org/show_bug.cgi?id=197953. + */ + if (!need_valid_mask) { + /* + * Mask all interrupts the community is able to generate + * but leave the ones that can only generate GPEs unmasked. + */ + chv_writel(GENMASK(31, pctrl->community->nirqs), + pctrl->regs + CHV_INTMASK); + } + /* Clear all interrupts */ chv_writel(0xffff, pctrl->regs + CHV_INTSTAT); Patches currently in stable-queue which might be from mika.westerberg(a)linux.intel.com are queue-4.4/pinctrl-cherryview-mask-all-interrupts-on-intel_strago-based-systems.patch

7 years, 8 months

2
1
0 0

[Linux-stable-mirror] Patch "x86/vsyscall/64: Warn and fail vsyscall emulation in NATIVE mode" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/vsyscall/64: Warn and fail vsyscall emulation in NATIVE mode to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 4831b779403a836158917d59a7ca880483c67378 Mon Sep 17 00:00:00 2001 From: Andy Lutomirski <luto(a)kernel.org> Date: Sun, 10 Dec 2017 22:47:20 -0800 Subject: x86/vsyscall/64: Warn and fail vsyscall emulation in NATIVE mode From: Andy Lutomirski <luto(a)kernel.org> commit 4831b779403a836158917d59a7ca880483c67378 upstream. If something goes wrong with pagetable setup, vsyscall=native will accidentally fall back to emulation. Make it warn and fail so that we notice. Signed-off-by: Andy Lutomirski <luto(a)kernel.org> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: David Laight <David.Laight(a)aculab.com> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Kees Cook <keescook(a)chromium.org> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/entry/vsyscall/vsyscall_64.c | 4 ++++ 1 file changed, 4 insertions(+) --- a/arch/x86/entry/vsyscall/vsyscall_64.c +++ b/arch/x86/entry/vsyscall/vsyscall_64.c @@ -139,6 +139,10 @@ bool emulate_vsyscall(struct pt_regs *re WARN_ON_ONCE(address != regs->ip); + /* This should be unreachable in NATIVE mode. */ + if (WARN_ON(vsyscall_mode == NATIVE)) + return false; + if (vsyscall_mode == NONE) { warn_bad_vsyscall(KERN_INFO, regs, "vsyscall attempted with vsyscall=none"); Patches currently in stable-queue which might be from luto(a)kernel.org are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/vsyscall/64: Explicitly set _PAGE_USER in the pagetable hierarchy" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/vsyscall/64: Explicitly set _PAGE_USER in the pagetable hierarchy to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 49275fef986abfb8b476e4708aaecc07e7d3e087 Mon Sep 17 00:00:00 2001 From: Andy Lutomirski <luto(a)kernel.org> Date: Sun, 10 Dec 2017 22:47:19 -0800 Subject: x86/vsyscall/64: Explicitly set _PAGE_USER in the pagetable hierarchy From: Andy Lutomirski <luto(a)kernel.org> commit 49275fef986abfb8b476e4708aaecc07e7d3e087 upstream. The kernel is very erratic as to which pagetables have _PAGE_USER set. The vsyscall page gets lucky: it seems that all of the relevant pagetables are among the apparently arbitrary ones that set _PAGE_USER. Rather than relying on chance, just explicitly set _PAGE_USER. This will let us clean up pagetable setup to stop setting _PAGE_USER. The added code can also be reused by pagetable isolation to manage the _PAGE_USER bit in the usermode tables. [ tglx: Folded paravirt fix from Juergen Gross ] Signed-off-by: Andy Lutomirski <luto(a)kernel.org> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: David Laight <David.Laight(a)aculab.com> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Kees Cook <keescook(a)chromium.org> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/entry/vsyscall/vsyscall_64.c | 34 +++++++++++++++++++++++++++++++++- 1 file changed, 33 insertions(+), 1 deletion(-) --- a/arch/x86/entry/vsyscall/vsyscall_64.c +++ b/arch/x86/entry/vsyscall/vsyscall_64.c @@ -37,6 +37,7 @@ #include <asm/unistd.h> #include <asm/fixmap.h> #include <asm/traps.h> +#include <asm/paravirt.h> #define CREATE_TRACE_POINTS #include "vsyscall_trace.h" @@ -329,16 +330,47 @@ int in_gate_area_no_mm(unsigned long add return vsyscall_mode != NONE && (addr & PAGE_MASK) == VSYSCALL_ADDR; } +/* + * The VSYSCALL page is the only user-accessible page in the kernel address + * range. Normally, the kernel page tables can have _PAGE_USER clear, but + * the tables covering VSYSCALL_ADDR need _PAGE_USER set if vsyscalls + * are enabled. + * + * Some day we may create a "minimal" vsyscall mode in which we emulate + * vsyscalls but leave the page not present. If so, we skip calling + * this. + */ +static void __init set_vsyscall_pgtable_user_bits(void) +{ + pgd_t *pgd; + p4d_t *p4d; + pud_t *pud; + pmd_t *pmd; + + pgd = pgd_offset_k(VSYSCALL_ADDR); + set_pgd(pgd, __pgd(pgd_val(*pgd) | _PAGE_USER)); + p4d = p4d_offset(pgd, VSYSCALL_ADDR); +#if CONFIG_PGTABLE_LEVELS >= 5 + p4d->p4d |= _PAGE_USER; +#endif + pud = pud_offset(p4d, VSYSCALL_ADDR); + set_pud(pud, __pud(pud_val(*pud) | _PAGE_USER)); + pmd = pmd_offset(pud, VSYSCALL_ADDR); + set_pmd(pmd, __pmd(pmd_val(*pmd) | _PAGE_USER)); +} + void __init map_vsyscall(void) { extern char __vsyscall_page; unsigned long physaddr_vsyscall = __pa_symbol(&__vsyscall_page); - if (vsyscall_mode != NONE) + if (vsyscall_mode != NONE) { __set_fixmap(VSYSCALL_PAGE, physaddr_vsyscall, vsyscall_mode == NATIVE ? PAGE_KERNEL_VSYSCALL : PAGE_KERNEL_VVAR); + set_vsyscall_pgtable_user_bits(); + } BUILD_BUG_ON((unsigned long)__fix_to_virt(VSYSCALL_PAGE) != (unsigned long)VSYSCALL_ADDR); Patches currently in stable-queue which might be from luto(a)kernel.org are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/uv: Use the right TLB-flush API" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/uv: Use the right TLB-flush API to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-uv-use-the-right-tlb-flush-api.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 3e46e0f5ee3643a1239be9046c7ba6c66ca2b329 Mon Sep 17 00:00:00 2001 From: Peter Zijlstra <peterz(a)infradead.org> Date: Tue, 5 Dec 2017 13:34:50 +0100 Subject: x86/uv: Use the right TLB-flush API From: Peter Zijlstra <peterz(a)infradead.org> commit 3e46e0f5ee3643a1239be9046c7ba6c66ca2b329 upstream. Since uv_flush_tlb_others() implements flush_tlb_others() which is about flushing user mappings, we should use __flush_tlb_single(), which too is about flushing user mappings. Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Acked-by: Andrew Banman <abanman(a)hpe.com> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: David Laight <David.Laight(a)aculab.com> Cc: Denys Vlasenko <dvlasenk(a)redhat.com> Cc: Eduardo Valentin <eduval(a)amazon.com> Cc: Greg KH <gregkh(a)linuxfoundation.org> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Mike Travis <mike.travis(a)hpe.com> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: Will Deacon <will.deacon(a)arm.com> Cc: aliguori(a)amazon.com Cc: daniel.gruss(a)iaik.tugraz.at Cc: hughd(a)google.com Cc: keescook(a)google.com Cc: linux-mm(a)kvack.org Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/platform/uv/tlb_uv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/arch/x86/platform/uv/tlb_uv.c +++ b/arch/x86/platform/uv/tlb_uv.c @@ -299,7 +299,7 @@ static void bau_process_message(struct m local_flush_tlb(); stat->d_alltlb++; } else { - __flush_tlb_one(msg->address); + __flush_tlb_single(msg->address); stat->d_onetlb++; } stat->d_requestee++; Patches currently in stable-queue which might be from peterz(a)infradead.org are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-decoder-fix-and-update-the-opcodes-map.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/tools-headers-sync-objtool-uapi-header.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/objtool-fix-64-bit-build-on-32-bit-host.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/objtool-move-synced-files-to-their-original-relative-locations.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/objtool-move-kernel-headers-code-sync-check-to-a-script.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch queue-4.14/objtool-fix-cross-build.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/mm: Use __flush_tlb_one() for kernel memory" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/mm: Use __flush_tlb_one() for kernel memory to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-mm-use-__flush_tlb_one-for-kernel-memory.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From a501686b2923ce6f2ff2b1d0d50682c6411baf72 Mon Sep 17 00:00:00 2001 From: Peter Zijlstra <peterz(a)infradead.org> Date: Tue, 5 Dec 2017 13:34:49 +0100 Subject: x86/mm: Use __flush_tlb_one() for kernel memory From: Peter Zijlstra <peterz(a)infradead.org> commit a501686b2923ce6f2ff2b1d0d50682c6411baf72 upstream. __flush_tlb_single() is for user mappings, __flush_tlb_one() for kernel mappings. Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: David Laight <David.Laight(a)aculab.com> Cc: Denys Vlasenko <dvlasenk(a)redhat.com> Cc: Eduardo Valentin <eduval(a)amazon.com> Cc: Greg KH <gregkh(a)linuxfoundation.org> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: Will Deacon <will.deacon(a)arm.com> Cc: aliguori(a)amazon.com Cc: daniel.gruss(a)iaik.tugraz.at Cc: hughd(a)google.com Cc: keescook(a)google.com Cc: linux-mm(a)kvack.org Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/mm/tlb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -551,7 +551,7 @@ static void do_kernel_range_flush(void * /* flush range by one by one 'invlpg' */ for (addr = f->start; addr < f->end; addr += PAGE_SIZE) - __flush_tlb_single(addr); + __flush_tlb_one(addr); } void flush_tlb_kernel_range(unsigned long start, unsigned long end) Patches currently in stable-queue which might be from peterz(a)infradead.org are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-decoder-fix-and-update-the-opcodes-map.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/tools-headers-sync-objtool-uapi-header.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/objtool-fix-64-bit-build-on-32-bit-host.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/objtool-move-synced-files-to-their-original-relative-locations.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/objtool-move-kernel-headers-code-sync-check-to-a-script.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch queue-4.14/objtool-fix-cross-build.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/mm: Remove hard-coded ASID limit checks" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/mm: Remove hard-coded ASID limit checks to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-mm-remove-hard-coded-asid-limit-checks.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From cb0a9144a744e55207e24dcef812f05cd15a499a Mon Sep 17 00:00:00 2001 From: Dave Hansen <dave.hansen(a)linux.intel.com> Date: Mon, 4 Dec 2017 15:07:55 +0100 Subject: x86/mm: Remove hard-coded ASID limit checks From: Dave Hansen <dave.hansen(a)linux.intel.com> commit cb0a9144a744e55207e24dcef812f05cd15a499a upstream. First, it's nice to remove the magic numbers. Second, PAGE_TABLE_ISOLATION is going to consume half of the available ASID space. The space is currently unused, but add a comment to spell out this new restriction. Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: Dave Hansen <dave.hansen(a)intel.com> Cc: David Laight <David.Laight(a)aculab.com> Cc: Denys Vlasenko <dvlasenk(a)redhat.com> Cc: Eduardo Valentin <eduval(a)amazon.com> Cc: Greg KH <gregkh(a)linuxfoundation.org> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: Will Deacon <will.deacon(a)arm.com> Cc: aliguori(a)amazon.com Cc: daniel.gruss(a)iaik.tugraz.at Cc: hughd(a)google.com Cc: keescook(a)google.com Cc: linux-mm(a)kvack.org Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/include/asm/tlbflush.h | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -69,6 +69,22 @@ static inline u64 inc_mm_tlb_gen(struct return atomic64_inc_return(&mm->context.tlb_gen); } +/* There are 12 bits of space for ASIDS in CR3 */ +#define CR3_HW_ASID_BITS 12 +/* + * When enabled, PAGE_TABLE_ISOLATION consumes a single bit for + * user/kernel switches + */ +#define PTI_CONSUMED_ASID_BITS 0 + +#define CR3_AVAIL_ASID_BITS (CR3_HW_ASID_BITS - PTI_CONSUMED_ASID_BITS) +/* + * ASIDs are zero-based: 0->MAX_AVAIL_ASID are valid. -1 below to account + * for them being zero-based. Another -1 is because ASID 0 is reserved for + * use by non-PCID-aware users. + */ +#define MAX_ASID_AVAILABLE ((1 << CR3_AVAIL_ASID_BITS) - 2) + /* * If PCID is on, ASID-aware code paths put the ASID+1 into the PCID bits. * This serves two purposes. It prevents a nasty situation in which @@ -81,7 +97,7 @@ struct pgd_t; static inline unsigned long build_cr3(pgd_t *pgd, u16 asid) { if (static_cpu_has(X86_FEATURE_PCID)) { - VM_WARN_ON_ONCE(asid > 4094); + VM_WARN_ON_ONCE(asid > MAX_ASID_AVAILABLE); return __sme_pa(pgd) | (asid + 1); } else { VM_WARN_ON_ONCE(asid != 0); @@ -91,7 +107,7 @@ static inline unsigned long build_cr3(pg static inline unsigned long build_cr3_noflush(pgd_t *pgd, u16 asid) { - VM_WARN_ON_ONCE(asid > 4094); + VM_WARN_ON_ONCE(asid > MAX_ASID_AVAILABLE); return __sme_pa(pgd) | (asid + 1) | CR3_NOFLUSH; } Patches currently in stable-queue which might be from dave.hansen(a)linux.intel.com are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/mm: Remove superfluous barriers" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/mm: Remove superfluous barriers to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-mm-remove-superfluous-barriers.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From b5fc6d943808b570bdfbec80f40c6b3855f1c48b Mon Sep 17 00:00:00 2001 From: Peter Zijlstra <peterz(a)infradead.org> Date: Tue, 5 Dec 2017 13:34:46 +0100 Subject: x86/mm: Remove superfluous barriers From: Peter Zijlstra <peterz(a)infradead.org> commit b5fc6d943808b570bdfbec80f40c6b3855f1c48b upstream. atomic64_inc_return() already implies smp_mb() before and after. Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: David Laight <David.Laight(a)aculab.com> Cc: Denys Vlasenko <dvlasenk(a)redhat.com> Cc: Eduardo Valentin <eduval(a)amazon.com> Cc: Greg KH <gregkh(a)linuxfoundation.org> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: Will Deacon <will.deacon(a)arm.com> Cc: aliguori(a)amazon.com Cc: daniel.gruss(a)iaik.tugraz.at Cc: hughd(a)google.com Cc: keescook(a)google.com Cc: linux-mm(a)kvack.org Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/include/asm/tlbflush.h | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-) --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -60,19 +60,13 @@ static inline void invpcid_flush_all_non static inline u64 inc_mm_tlb_gen(struct mm_struct *mm) { - u64 new_tlb_gen; - /* * Bump the generation count. This also serves as a full barrier * that synchronizes with switch_mm(): callers are required to order * their read of mm_cpumask after their writes to the paging * structures. */ - smp_mb__before_atomic(); - new_tlb_gen = atomic64_inc_return(&mm->context.tlb_gen); - smp_mb__after_atomic(); - - return new_tlb_gen; + return atomic64_inc_return(&mm->context.tlb_gen); } #ifdef CONFIG_PARAVIRT Patches currently in stable-queue which might be from peterz(a)infradead.org are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-decoder-fix-and-update-the-opcodes-map.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/tools-headers-sync-objtool-uapi-header.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/objtool-fix-64-bit-build-on-32-bit-host.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/objtool-move-synced-files-to-their-original-relative-locations.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/objtool-move-kernel-headers-code-sync-check-to-a-script.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch queue-4.14/objtool-fix-cross-build.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/mm: Put MMU to hardware ASID translation in one place" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/mm: Put MMU to hardware ASID translation in one place to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From dd95f1a4b5ca904c78e6a097091eb21436478abb Mon Sep 17 00:00:00 2001 From: Dave Hansen <dave.hansen(a)linux.intel.com> Date: Mon, 4 Dec 2017 15:07:56 +0100 Subject: x86/mm: Put MMU to hardware ASID translation in one place From: Dave Hansen <dave.hansen(a)linux.intel.com> commit dd95f1a4b5ca904c78e6a097091eb21436478abb upstream. There are effectively two ASID types: 1. The one stored in the mmu_context that goes from 0..5 2. The one programmed into the hardware that goes from 1..6 This consolidates the locations where converting between the two (by doing a +1) to a single place which gives us a nice place to comment. PAGE_TABLE_ISOLATION will also need to, given an ASID, know which hardware ASID to flush for the userspace mapping. Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: Dave Hansen <dave.hansen(a)intel.com> Cc: David Laight <David.Laight(a)aculab.com> Cc: Denys Vlasenko <dvlasenk(a)redhat.com> Cc: Eduardo Valentin <eduval(a)amazon.com> Cc: Greg KH <gregkh(a)linuxfoundation.org> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: Will Deacon <will.deacon(a)arm.com> Cc: aliguori(a)amazon.com Cc: daniel.gruss(a)iaik.tugraz.at Cc: hughd(a)google.com Cc: keescook(a)google.com Cc: linux-mm(a)kvack.org Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/include/asm/tlbflush.h | 29 ++++++++++++++++++----------- 1 file changed, 18 insertions(+), 11 deletions(-) --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -85,20 +85,26 @@ static inline u64 inc_mm_tlb_gen(struct */ #define MAX_ASID_AVAILABLE ((1 << CR3_AVAIL_ASID_BITS) - 2) -/* - * If PCID is on, ASID-aware code paths put the ASID+1 into the PCID bits. - * This serves two purposes. It prevents a nasty situation in which - * PCID-unaware code saves CR3, loads some other value (with PCID == 0), - * and then restores CR3, thus corrupting the TLB for ASID 0 if the saved - * ASID was nonzero. It also means that any bugs involving loading a - * PCID-enabled CR3 with CR4.PCIDE off will trigger deterministically. - */ +static inline u16 kern_pcid(u16 asid) +{ + VM_WARN_ON_ONCE(asid > MAX_ASID_AVAILABLE); + /* + * If PCID is on, ASID-aware code paths put the ASID+1 into the + * PCID bits. This serves two purposes. It prevents a nasty + * situation in which PCID-unaware code saves CR3, loads some other + * value (with PCID == 0), and then restores CR3, thus corrupting + * the TLB for ASID 0 if the saved ASID was nonzero. It also means + * that any bugs involving loading a PCID-enabled CR3 with + * CR4.PCIDE off will trigger deterministically. + */ + return asid + 1; +} + struct pgd_t; static inline unsigned long build_cr3(pgd_t *pgd, u16 asid) { if (static_cpu_has(X86_FEATURE_PCID)) { - VM_WARN_ON_ONCE(asid > MAX_ASID_AVAILABLE); - return __sme_pa(pgd) | (asid + 1); + return __sme_pa(pgd) | kern_pcid(asid); } else { VM_WARN_ON_ONCE(asid != 0); return __sme_pa(pgd); @@ -108,7 +114,8 @@ static inline unsigned long build_cr3(pg static inline unsigned long build_cr3_noflush(pgd_t *pgd, u16 asid) { VM_WARN_ON_ONCE(asid > MAX_ASID_AVAILABLE); - return __sme_pa(pgd) | (asid + 1) | CR3_NOFLUSH; + VM_WARN_ON_ONCE(!this_cpu_has(X86_FEATURE_PCID)); + return __sme_pa(pgd) | kern_pcid(asid) | CR3_NOFLUSH; } #ifdef CONFIG_PARAVIRT Patches currently in stable-queue which might be from dave.hansen(a)linux.intel.com are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/mm/dump_pagetables: Make the address hints correct and readable" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/mm/dump_pagetables: Make the address hints correct and readable to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 146122e24bdf208015d629babba673e28d090709 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner <tglx(a)linutronix.de> Date: Wed, 20 Dec 2017 18:07:42 +0100 Subject: x86/mm/dump_pagetables: Make the address hints correct and readable From: Thomas Gleixner <tglx(a)linutronix.de> commit 146122e24bdf208015d629babba673e28d090709 upstream. The address hints are a trainwreck. The array entry numbers have to kept magically in sync with the actual hints, which is doomed as some of the array members are initialized at runtime via the entry numbers. Designated initializers have been around before this code was implemented.... Use the entry numbers to populate the address hints array and add the missing bits and pieces. Split 32 and 64 bit for readability sake. Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: linux-kernel(a)vger.kernel.org Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/mm/dump_pagetables.c | 90 ++++++++++++++++++++++++------------------ 1 file changed, 53 insertions(+), 37 deletions(-) --- a/arch/x86/mm/dump_pagetables.c +++ b/arch/x86/mm/dump_pagetables.c @@ -44,10 +44,12 @@ struct addr_marker { unsigned long max_lines; }; -/* indices for address_markers; keep sync'd w/ address_markers below */ +/* Address space markers hints */ + +#ifdef CONFIG_X86_64 + enum address_markers_idx { USER_SPACE_NR = 0, -#ifdef CONFIG_X86_64 KERNEL_SPACE_NR, LOW_KERNEL_NR, VMALLOC_START_NR, @@ -56,56 +58,70 @@ enum address_markers_idx { KASAN_SHADOW_START_NR, KASAN_SHADOW_END_NR, #endif -# ifdef CONFIG_X86_ESPFIX64 +#ifdef CONFIG_X86_ESPFIX64 ESPFIX_START_NR, -# endif +#endif +#ifdef CONFIG_EFI + EFI_END_NR, +#endif HIGH_KERNEL_NR, MODULES_VADDR_NR, MODULES_END_NR, -#else + FIXADDR_START_NR, + END_OF_SPACE_NR, +}; + +static struct addr_marker address_markers[] = { + [USER_SPACE_NR] = { 0, "User Space" }, + [KERNEL_SPACE_NR] = { (1UL << 63), "Kernel Space" }, + [LOW_KERNEL_NR] = { 0UL, "Low Kernel Mapping" }, + [VMALLOC_START_NR] = { 0UL, "vmalloc() Area" }, + [VMEMMAP_START_NR] = { 0UL, "Vmemmap" }, +#ifdef CONFIG_KASAN + [KASAN_SHADOW_START_NR] = { KASAN_SHADOW_START, "KASAN shadow" }, + [KASAN_SHADOW_END_NR] = { KASAN_SHADOW_END, "KASAN shadow end" }, +#endif +#ifdef CONFIG_X86_ESPFIX64 + [ESPFIX_START_NR] = { ESPFIX_BASE_ADDR, "ESPfix Area", 16 }, +#endif +#ifdef CONFIG_EFI + [EFI_END_NR] = { EFI_VA_END, "EFI Runtime Services" }, +#endif + [HIGH_KERNEL_NR] = { __START_KERNEL_map, "High Kernel Mapping" }, + [MODULES_VADDR_NR] = { MODULES_VADDR, "Modules" }, + [MODULES_END_NR] = { MODULES_END, "End Modules" }, + [FIXADDR_START_NR] = { FIXADDR_START, "Fixmap Area" }, + [END_OF_SPACE_NR] = { -1, NULL } +}; + +#else /* CONFIG_X86_64 */ + +enum address_markers_idx { + USER_SPACE_NR = 0, KERNEL_SPACE_NR, VMALLOC_START_NR, VMALLOC_END_NR, -# ifdef CONFIG_HIGHMEM +#ifdef CONFIG_HIGHMEM PKMAP_BASE_NR, -# endif - FIXADDR_START_NR, #endif + FIXADDR_START_NR, + END_OF_SPACE_NR, }; -/* Address space markers hints */ static struct addr_marker address_markers[] = { - { 0, "User Space" }, -#ifdef CONFIG_X86_64 - { 0x8000000000000000UL, "Kernel Space" }, - { 0/* PAGE_OFFSET */, "Low Kernel Mapping" }, - { 0/* VMALLOC_START */, "vmalloc() Area" }, - { 0/* VMEMMAP_START */, "Vmemmap" }, -#ifdef CONFIG_KASAN - { KASAN_SHADOW_START, "KASAN shadow" }, - { KASAN_SHADOW_END, "KASAN shadow end" }, -#endif -# ifdef CONFIG_X86_ESPFIX64 - { ESPFIX_BASE_ADDR, "ESPfix Area", 16 }, -# endif -# ifdef CONFIG_EFI - { EFI_VA_END, "EFI Runtime Services" }, -# endif - { __START_KERNEL_map, "High Kernel Mapping" }, - { MODULES_VADDR, "Modules" }, - { MODULES_END, "End Modules" }, -#else - { PAGE_OFFSET, "Kernel Mapping" }, - { 0/* VMALLOC_START */, "vmalloc() Area" }, - { 0/*VMALLOC_END*/, "vmalloc() End" }, -# ifdef CONFIG_HIGHMEM - { 0/*PKMAP_BASE*/, "Persistent kmap() Area" }, -# endif - { 0/*FIXADDR_START*/, "Fixmap Area" }, + [USER_SPACE_NR] = { 0, "User Space" }, + [KERNEL_SPACE_NR] = { PAGE_OFFSET, "Kernel Mapping" }, + [VMALLOC_START_NR] = { 0UL, "vmalloc() Area" }, + [VMALLOC_END_NR] = { 0UL, "vmalloc() End" }, +#ifdef CONFIG_HIGHMEM + [PKMAP_BASE_NR] = { 0UL, "Persistent kmap() Area" }, #endif - { -1, NULL } /* End of list */ + [FIXADDR_START_NR] = { 0UL, "Fixmap area" }, + [END_OF_SPACE_NR] = { -1, NULL } }; +#endif /* !CONFIG_X86_64 */ + /* Multipliers for offsets within the PTEs */ #define PTE_LEVEL_MULT (PAGE_SIZE) #define PMD_LEVEL_MULT (PTRS_PER_PTE * PTE_LEVEL_MULT) Patches currently in stable-queue which might be from tglx(a)linutronix.de are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-decoder-fix-and-update-the-opcodes-map.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-cpu_entry_area-prevent-wraparound-in-setup_cpu_entry_area_ptes-on-32bit.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/pci-pm-force-devices-to-d0-in-pci_pm_thaw_noirq.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/tools-headers-sync-objtool-uapi-header.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/objtool-fix-64-bit-build-on-32-bit-host.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/objtool-move-synced-files-to-their-original-relative-locations.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/objtool-move-kernel-headers-code-sync-check-to-a-script.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch queue-4.14/objtool-fix-cross-build.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/mm: Move the CR3 construction functions to tlbflush.h" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/mm: Move the CR3 construction functions to tlbflush.h to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 50fb83a62cf472dc53ba23bd3f7bd6c1b2b3b53e Mon Sep 17 00:00:00 2001 From: Dave Hansen <dave.hansen(a)linux.intel.com> Date: Mon, 4 Dec 2017 15:07:54 +0100 Subject: x86/mm: Move the CR3 construction functions to tlbflush.h From: Dave Hansen <dave.hansen(a)linux.intel.com> commit 50fb83a62cf472dc53ba23bd3f7bd6c1b2b3b53e upstream. For flushing the TLB, the ASID which has been programmed into the hardware must be known. That differs from what is in 'cpu_tlbstate'. Add functions to transform the 'cpu_tlbstate' values into to the one programmed into the hardware (CR3). It's not easy to include mmu_context.h into tlbflush.h, so just move the CR3 building over to tlbflush.h. Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: David Laight <David.Laight(a)aculab.com> Cc: Denys Vlasenko <dvlasenk(a)redhat.com> Cc: Eduardo Valentin <eduval(a)amazon.com> Cc: Greg KH <gregkh(a)linuxfoundation.org> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: Will Deacon <will.deacon(a)arm.com> Cc: aliguori(a)amazon.com Cc: daniel.gruss(a)iaik.tugraz.at Cc: hughd(a)google.com Cc: keescook(a)google.com Cc: linux-mm(a)kvack.org Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/include/asm/mmu_context.h | 29 +---------------------------- arch/x86/include/asm/tlbflush.h | 26 ++++++++++++++++++++++++++ arch/x86/mm/tlb.c | 8 ++++---- 3 files changed, 31 insertions(+), 32 deletions(-) --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -291,33 +291,6 @@ static inline bool arch_vma_access_permi } /* - * If PCID is on, ASID-aware code paths put the ASID+1 into the PCID - * bits. This serves two purposes. It prevents a nasty situation in - * which PCID-unaware code saves CR3, loads some other value (with PCID - * == 0), and then restores CR3, thus corrupting the TLB for ASID 0 if - * the saved ASID was nonzero. It also means that any bugs involving - * loading a PCID-enabled CR3 with CR4.PCIDE off will trigger - * deterministically. - */ - -static inline unsigned long build_cr3(struct mm_struct *mm, u16 asid) -{ - if (static_cpu_has(X86_FEATURE_PCID)) { - VM_WARN_ON_ONCE(asid > 4094); - return __sme_pa(mm->pgd) | (asid + 1); - } else { - VM_WARN_ON_ONCE(asid != 0); - return __sme_pa(mm->pgd); - } -} - -static inline unsigned long build_cr3_noflush(struct mm_struct *mm, u16 asid) -{ - VM_WARN_ON_ONCE(asid > 4094); - return __sme_pa(mm->pgd) | (asid + 1) | CR3_NOFLUSH; -} - -/* * This can be used from process context to figure out what the value of * CR3 is without needing to do a (slow) __read_cr3(). * @@ -326,7 +299,7 @@ static inline unsigned long build_cr3_no */ static inline unsigned long __get_current_cr3_fast(void) { - unsigned long cr3 = build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm), + unsigned long cr3 = build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd, this_cpu_read(cpu_tlbstate.loaded_mm_asid)); /* For now, be very restrictive about when this can be called. */ --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -69,6 +69,32 @@ static inline u64 inc_mm_tlb_gen(struct return atomic64_inc_return(&mm->context.tlb_gen); } +/* + * If PCID is on, ASID-aware code paths put the ASID+1 into the PCID bits. + * This serves two purposes. It prevents a nasty situation in which + * PCID-unaware code saves CR3, loads some other value (with PCID == 0), + * and then restores CR3, thus corrupting the TLB for ASID 0 if the saved + * ASID was nonzero. It also means that any bugs involving loading a + * PCID-enabled CR3 with CR4.PCIDE off will trigger deterministically. + */ +struct pgd_t; +static inline unsigned long build_cr3(pgd_t *pgd, u16 asid) +{ + if (static_cpu_has(X86_FEATURE_PCID)) { + VM_WARN_ON_ONCE(asid > 4094); + return __sme_pa(pgd) | (asid + 1); + } else { + VM_WARN_ON_ONCE(asid != 0); + return __sme_pa(pgd); + } +} + +static inline unsigned long build_cr3_noflush(pgd_t *pgd, u16 asid) +{ + VM_WARN_ON_ONCE(asid > 4094); + return __sme_pa(pgd) | (asid + 1) | CR3_NOFLUSH; +} + #ifdef CONFIG_PARAVIRT #include <asm/paravirt.h> #else --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -128,7 +128,7 @@ void switch_mm_irqs_off(struct mm_struct * isn't free. */ #ifdef CONFIG_DEBUG_VM - if (WARN_ON_ONCE(__read_cr3() != build_cr3(real_prev, prev_asid))) { + if (WARN_ON_ONCE(__read_cr3() != build_cr3(real_prev->pgd, prev_asid))) { /* * If we were to BUG here, we'd be very likely to kill * the system so hard that we don't see the call trace. @@ -195,7 +195,7 @@ void switch_mm_irqs_off(struct mm_struct if (need_flush) { this_cpu_write(cpu_tlbstate.ctxs[new_asid].ctx_id, next->context.ctx_id); this_cpu_write(cpu_tlbstate.ctxs[new_asid].tlb_gen, next_tlb_gen); - write_cr3(build_cr3(next, new_asid)); + write_cr3(build_cr3(next->pgd, new_asid)); /* * NB: This gets called via leave_mm() in the idle path @@ -208,7 +208,7 @@ void switch_mm_irqs_off(struct mm_struct trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL); } else { /* The new ASID is already up to date. */ - write_cr3(build_cr3_noflush(next, new_asid)); + write_cr3(build_cr3_noflush(next->pgd, new_asid)); /* See above wrt _rcuidle. */ trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, 0); @@ -288,7 +288,7 @@ void initialize_tlbstate_and_flush(void) !(cr4_read_shadow() & X86_CR4_PCIDE)); /* Force ASID 0 and force a TLB flush. */ - write_cr3(build_cr3(mm, 0)); + write_cr3(build_cr3(mm->pgd, 0)); /* Reinitialize tlbstate. */ this_cpu_write(cpu_tlbstate.loaded_mm_asid, 0); Patches currently in stable-queue which might be from dave.hansen(a)linux.intel.com are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/mm: Create asm/invpcid.h" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/mm: Create asm/invpcid.h to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-mm-create-asm-invpcid.h.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 1a3b0caeb77edeac5ce5fa05e6a61c474c9a9745 Mon Sep 17 00:00:00 2001 From: Peter Zijlstra <peterz(a)infradead.org> Date: Tue, 5 Dec 2017 13:34:47 +0100 Subject: x86/mm: Create asm/invpcid.h From: Peter Zijlstra <peterz(a)infradead.org> commit 1a3b0caeb77edeac5ce5fa05e6a61c474c9a9745 upstream. Unclutter tlbflush.h a little. Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: David Laight <David.Laight(a)aculab.com> Cc: Denys Vlasenko <dvlasenk(a)redhat.com> Cc: Eduardo Valentin <eduval(a)amazon.com> Cc: Greg KH <gregkh(a)linuxfoundation.org> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: Thomas Gleixner <tglx(a)linutronix.de> Cc: Will Deacon <will.deacon(a)arm.com> Cc: aliguori(a)amazon.com Cc: daniel.gruss(a)iaik.tugraz.at Cc: hughd(a)google.com Cc: keescook(a)google.com Cc: linux-mm(a)kvack.org Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/include/asm/invpcid.h | 53 ++++++++++++++++++++++++++++++++++++++++ arch/x86/include/asm/tlbflush.h | 49 ------------------------------------ 2 files changed, 54 insertions(+), 48 deletions(-) --- /dev/null +++ b/arch/x86/include/asm/invpcid.h @@ -0,0 +1,53 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_INVPCID +#define _ASM_X86_INVPCID + +static inline void __invpcid(unsigned long pcid, unsigned long addr, + unsigned long type) +{ + struct { u64 d[2]; } desc = { { pcid, addr } }; + + /* + * The memory clobber is because the whole point is to invalidate + * stale TLB entries and, especially if we're flushing global + * mappings, we don't want the compiler to reorder any subsequent + * memory accesses before the TLB flush. + * + * The hex opcode is invpcid (%ecx), %eax in 32-bit mode and + * invpcid (%rcx), %rax in long mode. + */ + asm volatile (".byte 0x66, 0x0f, 0x38, 0x82, 0x01" + : : "m" (desc), "a" (type), "c" (&desc) : "memory"); +} + +#define INVPCID_TYPE_INDIV_ADDR 0 +#define INVPCID_TYPE_SINGLE_CTXT 1 +#define INVPCID_TYPE_ALL_INCL_GLOBAL 2 +#define INVPCID_TYPE_ALL_NON_GLOBAL 3 + +/* Flush all mappings for a given pcid and addr, not including globals. */ +static inline void invpcid_flush_one(unsigned long pcid, + unsigned long addr) +{ + __invpcid(pcid, addr, INVPCID_TYPE_INDIV_ADDR); +} + +/* Flush all mappings for a given PCID, not including globals. */ +static inline void invpcid_flush_single_context(unsigned long pcid) +{ + __invpcid(pcid, 0, INVPCID_TYPE_SINGLE_CTXT); +} + +/* Flush all mappings, including globals, for all PCIDs. */ +static inline void invpcid_flush_all(void) +{ + __invpcid(0, 0, INVPCID_TYPE_ALL_INCL_GLOBAL); +} + +/* Flush all mappings for all PCIDs except globals. */ +static inline void invpcid_flush_all_nonglobals(void) +{ + __invpcid(0, 0, INVPCID_TYPE_ALL_NON_GLOBAL); +} + +#endif /* _ASM_X86_INVPCID */ --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -9,54 +9,7 @@ #include <asm/cpufeature.h> #include <asm/special_insns.h> #include <asm/smp.h> - -static inline void __invpcid(unsigned long pcid, unsigned long addr, - unsigned long type) -{ - struct { u64 d[2]; } desc = { { pcid, addr } }; - - /* - * The memory clobber is because the whole point is to invalidate - * stale TLB entries and, especially if we're flushing global - * mappings, we don't want the compiler to reorder any subsequent - * memory accesses before the TLB flush. - * - * The hex opcode is invpcid (%ecx), %eax in 32-bit mode and - * invpcid (%rcx), %rax in long mode. - */ - asm volatile (".byte 0x66, 0x0f, 0x38, 0x82, 0x01" - : : "m" (desc), "a" (type), "c" (&desc) : "memory"); -} - -#define INVPCID_TYPE_INDIV_ADDR 0 -#define INVPCID_TYPE_SINGLE_CTXT 1 -#define INVPCID_TYPE_ALL_INCL_GLOBAL 2 -#define INVPCID_TYPE_ALL_NON_GLOBAL 3 - -/* Flush all mappings for a given pcid and addr, not including globals. */ -static inline void invpcid_flush_one(unsigned long pcid, - unsigned long addr) -{ - __invpcid(pcid, addr, INVPCID_TYPE_INDIV_ADDR); -} - -/* Flush all mappings for a given PCID, not including globals. */ -static inline void invpcid_flush_single_context(unsigned long pcid) -{ - __invpcid(pcid, 0, INVPCID_TYPE_SINGLE_CTXT); -} - -/* Flush all mappings, including globals, for all PCIDs. */ -static inline void invpcid_flush_all(void) -{ - __invpcid(0, 0, INVPCID_TYPE_ALL_INCL_GLOBAL); -} - -/* Flush all mappings for all PCIDs except globals. */ -static inline void invpcid_flush_all_nonglobals(void) -{ - __invpcid(0, 0, INVPCID_TYPE_ALL_NON_GLOBAL); -} +#include <asm/invpcid.h> static inline u64 inc_mm_tlb_gen(struct mm_struct *mm) { Patches currently in stable-queue which might be from peterz(a)infradead.org are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-decoder-fix-and-update-the-opcodes-map.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/tools-headers-sync-objtool-uapi-header.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/objtool-fix-64-bit-build-on-32-bit-host.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/objtool-move-synced-files-to-their-original-relative-locations.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/objtool-move-kernel-headers-code-sync-check-to-a-script.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch queue-4.14/objtool-fix-cross-build.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/mm/dump_pagetables: Check PAGE_PRESENT for real" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/mm/dump_pagetables: Check PAGE_PRESENT for real to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-mm-dump_pagetables-check-page_present-for-real.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From c05344947b37f7cda726e802457370bc6eac4d26 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner <tglx(a)linutronix.de> Date: Sat, 16 Dec 2017 01:14:39 +0100 Subject: x86/mm/dump_pagetables: Check PAGE_PRESENT for real From: Thomas Gleixner <tglx(a)linutronix.de> commit c05344947b37f7cda726e802457370bc6eac4d26 upstream. The check for a present page in printk_prot(): if (!pgprot_val(prot)) { /* Not present */ is bogus. If a PTE is set to PAGE_NONE then the pgprot_val is not zero and the entry is decoded in bogus ways, e.g. as RX GLB. That is confusing when analyzing mapping correctness. Check for the present bit to make an informed decision. Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: linux-kernel(a)vger.kernel.org Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/mm/dump_pagetables.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/arch/x86/mm/dump_pagetables.c +++ b/arch/x86/mm/dump_pagetables.c @@ -140,7 +140,7 @@ static void printk_prot(struct seq_file static const char * const level_name[] = { "cr3", "pgd", "p4d", "pud", "pmd", "pte" }; - if (!pgprot_val(prot)) { + if (!(pr & _PAGE_PRESENT)) { /* Not present */ pt_dump_cont_printf(m, dmsg, " "); } else { Patches currently in stable-queue which might be from tglx(a)linutronix.de are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-decoder-fix-and-update-the-opcodes-map.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-cpu_entry_area-prevent-wraparound-in-setup_cpu_entry_area_ptes-on-32bit.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/pci-pm-force-devices-to-d0-in-pci_pm_thaw_noirq.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/tools-headers-sync-objtool-uapi-header.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/objtool-fix-64-bit-build-on-32-bit-host.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/objtool-move-synced-files-to-their-original-relative-locations.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/objtool-move-kernel-headers-code-sync-check-to-a-script.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch queue-4.14/objtool-fix-cross-build.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/mm: Add comments to clarify which TLB-flush functions are supposed to flush what" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/mm: Add comments to clarify which TLB-flush functions are supposed to flush what to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 3f67af51e56f291d7417d77c4f67cd774633c5e1 Mon Sep 17 00:00:00 2001 From: Peter Zijlstra <peterz(a)infradead.org> Date: Tue, 5 Dec 2017 13:34:52 +0100 Subject: x86/mm: Add comments to clarify which TLB-flush functions are supposed to flush what From: Peter Zijlstra <peterz(a)infradead.org> commit 3f67af51e56f291d7417d77c4f67cd774633c5e1 upstream. Per popular request.. Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: David Laight <David.Laight(a)aculab.com> Cc: Denys Vlasenko <dvlasenk(a)redhat.com> Cc: Eduardo Valentin <eduval(a)amazon.com> Cc: Greg KH <gregkh(a)linuxfoundation.org> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: Will Deacon <will.deacon(a)arm.com> Cc: aliguori(a)amazon.com Cc: daniel.gruss(a)iaik.tugraz.at Cc: hughd(a)google.com Cc: keescook(a)google.com Cc: linux-mm(a)kvack.org Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/include/asm/tlbflush.h | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-) --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -228,6 +228,9 @@ static inline void cr4_set_bits_and_upda extern void initialize_tlbstate_and_flush(void); +/* + * flush the entire current user mapping + */ static inline void __native_flush_tlb(void) { /* @@ -240,6 +243,9 @@ static inline void __native_flush_tlb(vo preempt_enable(); } +/* + * flush everything + */ static inline void __native_flush_tlb_global(void) { unsigned long cr4, flags; @@ -269,17 +275,27 @@ static inline void __native_flush_tlb_gl raw_local_irq_restore(flags); } +/* + * flush one page in the user mapping + */ static inline void __native_flush_tlb_single(unsigned long addr) { asm volatile("invlpg (%0)" ::"r" (addr) : "memory"); } +/* + * flush everything + */ static inline void __flush_tlb_all(void) { - if (boot_cpu_has(X86_FEATURE_PGE)) + if (boot_cpu_has(X86_FEATURE_PGE)) { __flush_tlb_global(); - else + } else { + /* + * !PGE -> !PCID (setup_pcid()), thus every flush is total. + */ __flush_tlb(); + } /* * Note: if we somehow had PCID but not PGE, then this wouldn't work -- @@ -290,6 +306,9 @@ static inline void __flush_tlb_all(void) */ } +/* + * flush one page in the kernel mapping + */ static inline void __flush_tlb_one(unsigned long addr) { count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ONE); Patches currently in stable-queue which might be from peterz(a)infradead.org are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-decoder-fix-and-update-the-opcodes-map.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/tools-headers-sync-objtool-uapi-header.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/objtool-fix-64-bit-build-on-32-bit-host.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/objtool-move-synced-files-to-their-original-relative-locations.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/objtool-move-kernel-headers-code-sync-check-to-a-script.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch queue-4.14/objtool-fix-cross-build.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/microcode: Dont abuse the TLB-flush interface" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/microcode: Dont abuse the TLB-flush interface to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-microcode-dont-abuse-the-tlb-flush-interface.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 23cb7d46f371844c004784ad9552a57446f73e5a Mon Sep 17 00:00:00 2001 From: Peter Zijlstra <peterz(a)infradead.org> Date: Tue, 5 Dec 2017 13:34:51 +0100 Subject: x86/microcode: Dont abuse the TLB-flush interface From: Peter Zijlstra <peterz(a)infradead.org> commit 23cb7d46f371844c004784ad9552a57446f73e5a upstream. Commit: ec400ddeff20 ("x86/microcode_intel_early.c: Early update ucode on Intel's CPU") ... grubbed into tlbflush internals without coherent explanation. Since it says its a precaution and the SDM doesn't mention anything like this, take it out back. Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: David Laight <David.Laight(a)aculab.com> Cc: Denys Vlasenko <dvlasenk(a)redhat.com> Cc: Eduardo Valentin <eduval(a)amazon.com> Cc: Greg KH <gregkh(a)linuxfoundation.org> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: Will Deacon <will.deacon(a)arm.com> Cc: aliguori(a)amazon.com Cc: daniel.gruss(a)iaik.tugraz.at Cc: fenghua.yu(a)intel.com Cc: hughd(a)google.com Cc: keescook(a)google.com Cc: linux-mm(a)kvack.org Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/include/asm/tlbflush.h | 19 ++++++------------- arch/x86/kernel/cpu/microcode/intel.c | 13 ------------- 2 files changed, 6 insertions(+), 26 deletions(-) --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -246,20 +246,9 @@ static inline void __native_flush_tlb(vo preempt_enable(); } -static inline void __native_flush_tlb_global_irq_disabled(void) -{ - unsigned long cr4; - - cr4 = this_cpu_read(cpu_tlbstate.cr4); - /* clear PGE */ - native_write_cr4(cr4 & ~X86_CR4_PGE); - /* write old PGE again and flush TLBs */ - native_write_cr4(cr4); -} - static inline void __native_flush_tlb_global(void) { - unsigned long flags; + unsigned long cr4, flags; if (static_cpu_has(X86_FEATURE_INVPCID)) { /* @@ -277,7 +266,11 @@ static inline void __native_flush_tlb_gl */ raw_local_irq_save(flags); - __native_flush_tlb_global_irq_disabled(); + cr4 = this_cpu_read(cpu_tlbstate.cr4); + /* toggle PGE */ + native_write_cr4(cr4 ^ X86_CR4_PGE); + /* write old PGE again and flush TLBs */ + native_write_cr4(cr4); raw_local_irq_restore(flags); } --- a/arch/x86/kernel/cpu/microcode/intel.c +++ b/arch/x86/kernel/cpu/microcode/intel.c @@ -565,15 +565,6 @@ static void print_ucode(struct ucode_cpu } #else -/* - * Flush global tlb. We only do this in x86_64 where paging has been enabled - * already and PGE should be enabled as well. - */ -static inline void flush_tlb_early(void) -{ - __native_flush_tlb_global_irq_disabled(); -} - static inline void print_ucode(struct ucode_cpu_info *uci) { struct microcode_intel *mc; @@ -602,10 +593,6 @@ static int apply_microcode_early(struct if (rev != mc->hdr.rev) return -1; -#ifdef CONFIG_X86_64 - /* Flush global tlb. This is precaution. */ - flush_tlb_early(); -#endif uci->cpu_sig.rev = rev; if (early) Patches currently in stable-queue which might be from peterz(a)infradead.org are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-decoder-fix-and-update-the-opcodes-map.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/tools-headers-sync-objtool-uapi-header.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/objtool-fix-64-bit-build-on-32-bit-host.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/objtool-move-synced-files-to-their-original-relative-locations.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/objtool-move-kernel-headers-code-sync-check-to-a-script.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch queue-4.14/objtool-fix-cross-build.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/mm/64: Improve the memory map documentation" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/mm/64: Improve the memory map documentation to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-mm-64-improve-the-memory-map-documentation.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 5a7ccf4754fb3660569a6de52ba7f7fc3dfaf280 Mon Sep 17 00:00:00 2001 From: Andy Lutomirski <luto(a)kernel.org> Date: Tue, 12 Dec 2017 07:56:43 -0800 Subject: x86/mm/64: Improve the memory map documentation From: Andy Lutomirski <luto(a)kernel.org> commit 5a7ccf4754fb3660569a6de52ba7f7fc3dfaf280 upstream. The old docs had the vsyscall range wrong and were missing the fixmap. Fix both. There used to be 8 MB reserved for future vsyscalls, but that's long gone. Signed-off-by: Andy Lutomirski <luto(a)kernel.org> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: Dave Hansen <dave.hansen(a)intel.com> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: David Laight <David.Laight(a)aculab.com> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Kees Cook <keescook(a)chromium.org> Cc: Kirill A. Shutemov <kirill(a)shutemov.name> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- Documentation/x86/x86_64/mm.txt | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) --- a/Documentation/x86/x86_64/mm.txt +++ b/Documentation/x86/x86_64/mm.txt @@ -19,8 +19,9 @@ ffffff0000000000 - ffffff7fffffffff (=39 ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space ... unused hole ... ffffffff80000000 - ffffffff9fffffff (=512 MB) kernel text mapping, from phys 0 -ffffffffa0000000 - ffffffffff5fffff (=1526 MB) module mapping space (variable) -ffffffffff600000 - ffffffffffdfffff (=8 MB) vsyscalls +ffffffffa0000000 - [fixmap start] (~1526 MB) module mapping space (variable) +[fixmap start] - ffffffffff5fffff kernel-internal fixmap range +ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole Virtual memory map with 5 level page tables: @@ -41,8 +42,9 @@ ffffff0000000000 - ffffff7fffffffff (=39 ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space ... unused hole ... ffffffff80000000 - ffffffff9fffffff (=512 MB) kernel text mapping, from phys 0 -ffffffffa0000000 - ffffffffff5fffff (=1526 MB) module mapping space -ffffffffff600000 - ffffffffffdfffff (=8 MB) vsyscalls +ffffffffa0000000 - [fixmap start] (~1526 MB) module mapping space +[fixmap start] - ffffffffff5fffff kernel-internal fixmap range +ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole Architecture defines a 64-bit virtual address. Implementations can support Patches currently in stable-queue which might be from luto(a)kernel.org are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/ldt: Rework locking" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/ldt: Rework locking to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-ldt-rework-locking.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From c2b3496bb30bd159e9de42e5c952e1f1f33c9a77 Mon Sep 17 00:00:00 2001 From: Peter Zijlstra <peterz(a)infradead.org> Date: Thu, 14 Dec 2017 12:27:30 +0100 Subject: x86/ldt: Rework locking From: Peter Zijlstra <peterz(a)infradead.org> commit c2b3496bb30bd159e9de42e5c952e1f1f33c9a77 upstream. The LDT is duplicated on fork() and on exec(), which is wrong as exec() should start from a clean state, i.e. without LDT. To fix this the LDT duplication code will be moved into arch_dup_mmap() which is only called for fork(). This introduces a locking problem. arch_dup_mmap() holds mmap_sem of the parent process, but the LDT duplication code needs to acquire mm->context.lock to access the LDT data safely, which is the reverse lock order of write_ldt() where mmap_sem nests into context.lock. Solve this by introducing a new rw semaphore which serializes the read/write_ldt() syscall operations and use context.lock to protect the actual installment of the LDT descriptor. So context.lock stabilizes mm->context.ldt and can nest inside of the new semaphore or mmap_sem. Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Andy Lutomirsky <luto(a)kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Borislav Petkov <bpetkov(a)suse.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: Dave Hansen <dave.hansen(a)intel.com> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: David Laight <David.Laight(a)aculab.com> Cc: Denys Vlasenko <dvlasenk(a)redhat.com> Cc: Eduardo Valentin <eduval(a)amazon.com> Cc: Greg KH <gregkh(a)linuxfoundation.org> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: Will Deacon <will.deacon(a)arm.com> Cc: aliguori(a)amazon.com Cc: dan.j.williams(a)intel.com Cc: hughd(a)google.com Cc: keescook(a)google.com Cc: kirill.shutemov(a)linux.intel.com Cc: linux-mm(a)kvack.org Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/include/asm/mmu.h | 4 +++- arch/x86/include/asm/mmu_context.h | 2 ++ arch/x86/kernel/ldt.c | 33 +++++++++++++++++++++------------ 3 files changed, 26 insertions(+), 13 deletions(-) --- a/arch/x86/include/asm/mmu.h +++ b/arch/x86/include/asm/mmu.h @@ -3,6 +3,7 @@ #define _ASM_X86_MMU_H #include <linux/spinlock.h> +#include <linux/rwsem.h> #include <linux/mutex.h> #include <linux/atomic.h> @@ -27,7 +28,8 @@ typedef struct { atomic64_t tlb_gen; #ifdef CONFIG_MODIFY_LDT_SYSCALL - struct ldt_struct *ldt; + struct rw_semaphore ldt_usr_sem; + struct ldt_struct *ldt; #endif #ifdef CONFIG_X86_64 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -132,6 +132,8 @@ void enter_lazy_tlb(struct mm_struct *mm static inline int init_new_context(struct task_struct *tsk, struct mm_struct *mm) { + mutex_init(&mm->context.lock); + mm->context.ctx_id = atomic64_inc_return(&last_mm_ctx_id); atomic64_set(&mm->context.tlb_gen, 0); --- a/arch/x86/kernel/ldt.c +++ b/arch/x86/kernel/ldt.c @@ -5,6 +5,11 @@ * Copyright (C) 2002 Andi Kleen * * This handles calls from both 32bit and 64bit mode. + * + * Lock order: + * contex.ldt_usr_sem + * mmap_sem + * context.lock */ #include <linux/errno.h> @@ -42,7 +47,7 @@ static void refresh_ldt_segments(void) #endif } -/* context.lock is held for us, so we don't need any locking. */ +/* context.lock is held by the task which issued the smp function call */ static void flush_ldt(void *__mm) { struct mm_struct *mm = __mm; @@ -99,15 +104,17 @@ static void finalize_ldt_struct(struct l paravirt_alloc_ldt(ldt->entries, ldt->nr_entries); } -/* context.lock is held */ -static void install_ldt(struct mm_struct *current_mm, - struct ldt_struct *ldt) +static void install_ldt(struct mm_struct *mm, struct ldt_struct *ldt) { + mutex_lock(&mm->context.lock); + /* Synchronizes with READ_ONCE in load_mm_ldt. */ - smp_store_release(&current_mm->context.ldt, ldt); + smp_store_release(&mm->context.ldt, ldt); - /* Activate the LDT for all CPUs using current_mm. */ - on_each_cpu_mask(mm_cpumask(current_mm), flush_ldt, current_mm, true); + /* Activate the LDT for all CPUs using currents mm. */ + on_each_cpu_mask(mm_cpumask(mm), flush_ldt, mm, true); + + mutex_unlock(&mm->context.lock); } static void free_ldt_struct(struct ldt_struct *ldt) @@ -133,7 +140,8 @@ int init_new_context_ldt(struct task_str struct mm_struct *old_mm; int retval = 0; - mutex_init(&mm->context.lock); + init_rwsem(&mm->context.ldt_usr_sem); + old_mm = current->mm; if (!old_mm) { mm->context.ldt = NULL; @@ -180,7 +188,7 @@ static int read_ldt(void __user *ptr, un unsigned long entries_size; int retval; - mutex_lock(&mm->context.lock); + down_read(&mm->context.ldt_usr_sem); if (!mm->context.ldt) { retval = 0; @@ -209,7 +217,7 @@ static int read_ldt(void __user *ptr, un retval = bytecount; out_unlock: - mutex_unlock(&mm->context.lock); + up_read(&mm->context.ldt_usr_sem); return retval; } @@ -269,7 +277,8 @@ static int write_ldt(void __user *ptr, u ldt.avl = 0; } - mutex_lock(&mm->context.lock); + if (down_write_killable(&mm->context.ldt_usr_sem)) + return -EINTR; old_ldt = mm->context.ldt; old_nr_entries = old_ldt ? old_ldt->nr_entries : 0; @@ -291,7 +300,7 @@ static int write_ldt(void __user *ptr, u error = 0; out_unlock: - mutex_unlock(&mm->context.lock); + up_write(&mm->context.ldt_usr_sem); out: return error; } Patches currently in stable-queue which might be from peterz(a)infradead.org are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-decoder-fix-and-update-the-opcodes-map.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/tools-headers-sync-objtool-uapi-header.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/objtool-fix-64-bit-build-on-32-bit-host.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/objtool-move-synced-files-to-their-original-relative-locations.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/objtool-move-kernel-headers-code-sync-check-to-a-script.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch queue-4.14/objtool-fix-cross-build.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/ldt: Prevent LDT inheritance on exec" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/ldt: Prevent LDT inheritance on exec to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-ldt-prevent-ldt-inheritance-on-exec.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From a4828f81037f491b2cc986595e3a969a6eeb2fb5 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner <tglx(a)linutronix.de> Date: Thu, 14 Dec 2017 12:27:31 +0100 Subject: x86/ldt: Prevent LDT inheritance on exec From: Thomas Gleixner <tglx(a)linutronix.de> commit a4828f81037f491b2cc986595e3a969a6eeb2fb5 upstream. The LDT is inherited across fork() or exec(), but that makes no sense at all because exec() is supposed to start the process clean. The reason why this happens is that init_new_context_ldt() is called from init_new_context() which obviously needs to be called for both fork() and exec(). It would be surprising if anything relies on that behaviour, so it seems to be safe to remove that misfeature. Split the context initialization into two parts. Clear the LDT pointer and initialize the mutex from the general context init and move the LDT duplication to arch_dup_mmap() which is only called on fork(). Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Signed-off-by: Peter Zijlstra <peterz(a)infradead.org> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Andy Lutomirsky <luto(a)kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Borislav Petkov <bpetkov(a)suse.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: Dave Hansen <dave.hansen(a)intel.com> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: David Laight <David.Laight(a)aculab.com> Cc: Denys Vlasenko <dvlasenk(a)redhat.com> Cc: Eduardo Valentin <eduval(a)amazon.com> Cc: Greg KH <gregkh(a)linuxfoundation.org> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Will Deacon <will.deacon(a)arm.com> Cc: aliguori(a)amazon.com Cc: dan.j.williams(a)intel.com Cc: hughd(a)google.com Cc: keescook(a)google.com Cc: kirill.shutemov(a)linux.intel.com Cc: linux-mm(a)kvack.org Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/include/asm/mmu_context.h | 21 ++++++++++++++------- arch/x86/kernel/ldt.c | 18 +++++------------- tools/testing/selftests/x86/ldt_gdt.c | 9 +++------ 3 files changed, 22 insertions(+), 26 deletions(-) --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -57,11 +57,17 @@ struct ldt_struct { /* * Used for LDT copy/destruction. */ -int init_new_context_ldt(struct task_struct *tsk, struct mm_struct *mm); +static inline void init_new_context_ldt(struct mm_struct *mm) +{ + mm->context.ldt = NULL; + init_rwsem(&mm->context.ldt_usr_sem); +} +int ldt_dup_context(struct mm_struct *oldmm, struct mm_struct *mm); void destroy_context_ldt(struct mm_struct *mm); #else /* CONFIG_MODIFY_LDT_SYSCALL */ -static inline int init_new_context_ldt(struct task_struct *tsk, - struct mm_struct *mm) +static inline void init_new_context_ldt(struct mm_struct *mm) { } +static inline int ldt_dup_context(struct mm_struct *oldmm, + struct mm_struct *mm) { return 0; } @@ -137,15 +143,16 @@ static inline int init_new_context(struc mm->context.ctx_id = atomic64_inc_return(&last_mm_ctx_id); atomic64_set(&mm->context.tlb_gen, 0); - #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS +#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS if (cpu_feature_enabled(X86_FEATURE_OSPKE)) { /* pkey 0 is the default and always allocated */ mm->context.pkey_allocation_map = 0x1; /* -1 means unallocated or invalid */ mm->context.execute_only_pkey = -1; } - #endif - return init_new_context_ldt(tsk, mm); +#endif + init_new_context_ldt(mm); + return 0; } static inline void destroy_context(struct mm_struct *mm) { @@ -181,7 +188,7 @@ do { \ static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm) { paravirt_arch_dup_mmap(oldmm, mm); - return 0; + return ldt_dup_context(oldmm, mm); } static inline void arch_exit_mmap(struct mm_struct *mm) --- a/arch/x86/kernel/ldt.c +++ b/arch/x86/kernel/ldt.c @@ -131,28 +131,20 @@ static void free_ldt_struct(struct ldt_s } /* - * we do not have to muck with descriptors here, that is - * done in switch_mm() as needed. + * Called on fork from arch_dup_mmap(). Just copy the current LDT state, + * the new task is not running, so nothing can be installed. */ -int init_new_context_ldt(struct task_struct *tsk, struct mm_struct *mm) +int ldt_dup_context(struct mm_struct *old_mm, struct mm_struct *mm) { struct ldt_struct *new_ldt; - struct mm_struct *old_mm; int retval = 0; - init_rwsem(&mm->context.ldt_usr_sem); - - old_mm = current->mm; - if (!old_mm) { - mm->context.ldt = NULL; + if (!old_mm) return 0; - } mutex_lock(&old_mm->context.lock); - if (!old_mm->context.ldt) { - mm->context.ldt = NULL; + if (!old_mm->context.ldt) goto out_unlock; - } new_ldt = alloc_ldt_struct(old_mm->context.ldt->nr_entries); if (!new_ldt) { --- a/tools/testing/selftests/x86/ldt_gdt.c +++ b/tools/testing/selftests/x86/ldt_gdt.c @@ -627,13 +627,10 @@ static void do_multicpu_tests(void) static int finish_exec_test(void) { /* - * In a sensible world, this would be check_invalid_segment(0, 1); - * For better or for worse, though, the LDT is inherited across exec. - * We can probably change this safely, but for now we test it. + * Older kernel versions did inherit the LDT on exec() which is + * wrong because exec() starts from a clean state. */ - check_valid_segment(0, 1, - AR_DPL3 | AR_TYPE_XRCODE | AR_S | AR_P | AR_DB, - 42, true); + check_invalid_segment(0, 1); return nerrs ? 1 : 0; } Patches currently in stable-queue which might be from tglx(a)linutronix.de are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-decoder-fix-and-update-the-opcodes-map.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-cpu_entry_area-prevent-wraparound-in-setup_cpu_entry_area_ptes-on-32bit.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/pci-pm-force-devices-to-d0-in-pci_pm_thaw_noirq.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/tools-headers-sync-objtool-uapi-header.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/objtool-fix-64-bit-build-on-32-bit-host.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/objtool-move-synced-files-to-their-original-relative-locations.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/objtool-move-kernel-headers-code-sync-check-to-a-script.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch queue-4.14/objtool-fix-cross-build.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/Kconfig: Limit NR_CPUS on 32-bit to a sane amount" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/Kconfig: Limit NR_CPUS on 32-bit to a sane amount to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 7bbcbd3d1cdcbacd0f9f8dc4c98d550972f1ca30 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner <tglx(a)linutronix.de> Date: Wed, 20 Dec 2017 18:02:34 +0100 Subject: x86/Kconfig: Limit NR_CPUS on 32-bit to a sane amount From: Thomas Gleixner <tglx(a)linutronix.de> commit 7bbcbd3d1cdcbacd0f9f8dc4c98d550972f1ca30 upstream. The recent cpu_entry_area changes fail to compile on 32-bit when BIGSMP=y and NR_CPUS=512, because the fixmap area becomes too big. Limit the number of CPUs with BIGSMP to 64, which is already way to big for 32-bit, but it's at least a working limitation. We performed a quick survey of 32-bit-only machines that might be affected by this change negatively, but found none. Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: linux-kernel(a)vger.kernel.org Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/Kconfig | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -925,7 +925,8 @@ config MAXSMP config NR_CPUS int "Maximum number of CPUs" if SMP && !MAXSMP range 2 8 if SMP && X86_32 && !X86_BIGSMP - range 2 512 if SMP && !MAXSMP && !CPUMASK_OFFSTACK + range 2 64 if SMP && X86_32 && X86_BIGSMP + range 2 512 if SMP && !MAXSMP && !CPUMASK_OFFSTACK && X86_64 range 2 8192 if SMP && !MAXSMP && CPUMASK_OFFSTACK && X86_64 default "1" if !SMP default "8192" if MAXSMP Patches currently in stable-queue which might be from tglx(a)linutronix.de are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-decoder-fix-and-update-the-opcodes-map.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-cpu_entry_area-prevent-wraparound-in-setup_cpu_entry_area_ptes-on-32bit.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/pci-pm-force-devices-to-d0-in-pci_pm_thaw_noirq.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/tools-headers-sync-objtool-uapi-header.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/objtool-fix-64-bit-build-on-32-bit-host.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/objtool-move-synced-files-to-their-original-relative-locations.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/objtool-move-kernel-headers-code-sync-check-to-a-script.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch queue-4.14/objtool-fix-cross-build.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/entry: Rename SYSENTER_stack to CPU_ENTRY_AREA_entry_stack" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/entry: Rename SYSENTER_stack to CPU_ENTRY_AREA_entry_stack to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 4fe2d8b11a370af286287a2661de9d4e6c9a145a Mon Sep 17 00:00:00 2001 From: Dave Hansen <dave.hansen(a)linux.intel.com> Date: Mon, 4 Dec 2017 17:25:07 -0800 Subject: x86/entry: Rename SYSENTER_stack to CPU_ENTRY_AREA_entry_stack From: Dave Hansen <dave.hansen(a)linux.intel.com> commit 4fe2d8b11a370af286287a2661de9d4e6c9a145a upstream. If the kernel oopses while on the trampoline stack, it will print "<SYSENTER>" even if SYSENTER is not involved. That is rather confusing. The "SYSENTER" stack is used for a lot more than SYSENTER now. Give it a better string to display in stack dumps, and rename the kernel code to match. Also move the 32-bit code over to the new naming even though it still uses the entry stack only for SYSENTER. Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Borislav Petkov <bp(a)suse.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: Denys Vlasenko <dvlasenk(a)redhat.com> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/entry/entry_32.S | 12 ++++++------ arch/x86/entry/entry_64.S | 4 ++-- arch/x86/include/asm/fixmap.h | 8 ++++---- arch/x86/include/asm/processor.h | 6 +++--- arch/x86/include/asm/stacktrace.h | 4 ++-- arch/x86/kernel/asm-offsets.c | 4 ++-- arch/x86/kernel/asm-offsets_32.c | 2 +- arch/x86/kernel/cpu/common.c | 14 +++++++------- arch/x86/kernel/dumpstack.c | 10 +++++----- arch/x86/kernel/dumpstack_32.c | 6 +++--- arch/x86/kernel/dumpstack_64.c | 12 +++++++++--- 11 files changed, 44 insertions(+), 38 deletions(-) --- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -942,9 +942,9 @@ ENTRY(debug) /* Are we currently on the SYSENTER stack? */ movl PER_CPU_VAR(cpu_entry_area), %ecx - addl $CPU_ENTRY_AREA_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx - subl %eax, %ecx /* ecx = (end of SYSENTER_stack) - esp */ - cmpl $SIZEOF_SYSENTER_stack, %ecx + addl $CPU_ENTRY_AREA_entry_stack + SIZEOF_entry_stack, %ecx + subl %eax, %ecx /* ecx = (end of entry_stack) - esp */ + cmpl $SIZEOF_entry_stack, %ecx jb .Ldebug_from_sysenter_stack TRACE_IRQS_OFF @@ -986,9 +986,9 @@ ENTRY(nmi) /* Are we currently on the SYSENTER stack? */ movl PER_CPU_VAR(cpu_entry_area), %ecx - addl $CPU_ENTRY_AREA_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx - subl %eax, %ecx /* ecx = (end of SYSENTER_stack) - esp */ - cmpl $SIZEOF_SYSENTER_stack, %ecx + addl $CPU_ENTRY_AREA_entry_stack + SIZEOF_entry_stack, %ecx + subl %eax, %ecx /* ecx = (end of entry_stack) - esp */ + cmpl $SIZEOF_entry_stack, %ecx jb .Lnmi_from_sysenter_stack /* Not on SYSENTER stack. */ --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -154,8 +154,8 @@ END(native_usergs_sysret64) _entry_trampoline - CPU_ENTRY_AREA_entry_trampoline(%rip) /* The top word of the SYSENTER stack is hot and is usable as scratch space. */ -#define RSP_SCRATCH CPU_ENTRY_AREA_SYSENTER_stack + \ - SIZEOF_SYSENTER_stack - 8 + CPU_ENTRY_AREA +#define RSP_SCRATCH CPU_ENTRY_AREA_entry_stack + \ + SIZEOF_entry_stack - 8 + CPU_ENTRY_AREA ENTRY(entry_SYSCALL_64_trampoline) UNWIND_HINT_EMPTY --- a/arch/x86/include/asm/fixmap.h +++ b/arch/x86/include/asm/fixmap.h @@ -56,10 +56,10 @@ struct cpu_entry_area { char gdt[PAGE_SIZE]; /* - * The GDT is just below SYSENTER_stack and thus serves (on x86_64) as + * The GDT is just below entry_stack and thus serves (on x86_64) as * a a read-only guard page. */ - struct SYSENTER_stack_page SYSENTER_stack_page; + struct entry_stack_page entry_stack_page; /* * On x86_64, the TSS is mapped RO. On x86_32, it's mapped RW because @@ -250,9 +250,9 @@ static inline struct cpu_entry_area *get return (struct cpu_entry_area *)__fix_to_virt(__get_cpu_entry_area_page_index(cpu, 0)); } -static inline struct SYSENTER_stack *cpu_SYSENTER_stack(int cpu) +static inline struct entry_stack *cpu_entry_stack(int cpu) { - return &get_cpu_entry_area(cpu)->SYSENTER_stack_page.stack; + return &get_cpu_entry_area(cpu)->entry_stack_page.stack; } #endif /* !__ASSEMBLY__ */ --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -336,12 +336,12 @@ struct x86_hw_tss { #define IO_BITMAP_OFFSET (offsetof(struct tss_struct, io_bitmap) - offsetof(struct tss_struct, x86_tss)) #define INVALID_IO_BITMAP_OFFSET 0x8000 -struct SYSENTER_stack { +struct entry_stack { unsigned long words[64]; }; -struct SYSENTER_stack_page { - struct SYSENTER_stack stack; +struct entry_stack_page { + struct entry_stack stack; } __aligned(PAGE_SIZE); struct tss_struct { --- a/arch/x86/include/asm/stacktrace.h +++ b/arch/x86/include/asm/stacktrace.h @@ -16,7 +16,7 @@ enum stack_type { STACK_TYPE_TASK, STACK_TYPE_IRQ, STACK_TYPE_SOFTIRQ, - STACK_TYPE_SYSENTER, + STACK_TYPE_ENTRY, STACK_TYPE_EXCEPTION, STACK_TYPE_EXCEPTION_LAST = STACK_TYPE_EXCEPTION + N_EXCEPTION_STACKS-1, }; @@ -29,7 +29,7 @@ struct stack_info { bool in_task_stack(unsigned long *stack, struct task_struct *task, struct stack_info *info); -bool in_sysenter_stack(unsigned long *stack, struct stack_info *info); +bool in_entry_stack(unsigned long *stack, struct stack_info *info); int get_stack_info(unsigned long *stack, struct task_struct *task, struct stack_info *info, unsigned long *visit_mask); --- a/arch/x86/kernel/asm-offsets.c +++ b/arch/x86/kernel/asm-offsets.c @@ -97,6 +97,6 @@ void common(void) { /* Layout info for cpu_entry_area */ OFFSET(CPU_ENTRY_AREA_tss, cpu_entry_area, tss); OFFSET(CPU_ENTRY_AREA_entry_trampoline, cpu_entry_area, entry_trampoline); - OFFSET(CPU_ENTRY_AREA_SYSENTER_stack, cpu_entry_area, SYSENTER_stack_page); - DEFINE(SIZEOF_SYSENTER_stack, sizeof(struct SYSENTER_stack)); + OFFSET(CPU_ENTRY_AREA_entry_stack, cpu_entry_area, entry_stack_page); + DEFINE(SIZEOF_entry_stack, sizeof(struct entry_stack)); } --- a/arch/x86/kernel/asm-offsets_32.c +++ b/arch/x86/kernel/asm-offsets_32.c @@ -48,7 +48,7 @@ void foo(void) /* Offset from the sysenter stack to tss.sp0 */ DEFINE(TSS_sysenter_sp0, offsetof(struct cpu_entry_area, tss.x86_tss.sp0) - - offsetofend(struct cpu_entry_area, SYSENTER_stack_page.stack)); + offsetofend(struct cpu_entry_area, entry_stack_page.stack)); #ifdef CONFIG_CC_STACKPROTECTOR BLANK(); --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -487,8 +487,8 @@ static DEFINE_PER_CPU_PAGE_ALIGNED(char, [(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ]); #endif -static DEFINE_PER_CPU_PAGE_ALIGNED(struct SYSENTER_stack_page, - SYSENTER_stack_storage); +static DEFINE_PER_CPU_PAGE_ALIGNED(struct entry_stack_page, + entry_stack_storage); static void __init set_percpu_fixmap_pages(int idx, void *ptr, int pages, pgprot_t prot) @@ -523,8 +523,8 @@ static void __init setup_cpu_entry_area( #endif __set_fixmap(get_cpu_entry_area_index(cpu, gdt), get_cpu_gdt_paddr(cpu), gdt_prot); - set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, SYSENTER_stack_page), - per_cpu_ptr(&SYSENTER_stack_storage, cpu), 1, + set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, entry_stack_page), + per_cpu_ptr(&entry_stack_storage, cpu), 1, PAGE_KERNEL); /* @@ -1323,7 +1323,7 @@ void enable_sep_cpu(void) tss->x86_tss.ss1 = __KERNEL_CS; wrmsr(MSR_IA32_SYSENTER_CS, tss->x86_tss.ss1, 0); - wrmsr(MSR_IA32_SYSENTER_ESP, (unsigned long)(cpu_SYSENTER_stack(cpu) + 1), 0); + wrmsr(MSR_IA32_SYSENTER_ESP, (unsigned long)(cpu_entry_stack(cpu) + 1), 0); wrmsr(MSR_IA32_SYSENTER_EIP, (unsigned long)entry_SYSENTER_32, 0); put_cpu(); @@ -1440,7 +1440,7 @@ void syscall_init(void) * AMD doesn't allow SYSENTER in long mode (either 32- or 64-bit). */ wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)__KERNEL_CS); - wrmsrl_safe(MSR_IA32_SYSENTER_ESP, (unsigned long)(cpu_SYSENTER_stack(cpu) + 1)); + wrmsrl_safe(MSR_IA32_SYSENTER_ESP, (unsigned long)(cpu_entry_stack(cpu) + 1)); wrmsrl_safe(MSR_IA32_SYSENTER_EIP, (u64)entry_SYSENTER_compat); #else wrmsrl(MSR_CSTAR, (unsigned long)ignore_sysret); @@ -1655,7 +1655,7 @@ void cpu_init(void) */ set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss); load_TR_desc(); - load_sp0((unsigned long)(cpu_SYSENTER_stack(cpu) + 1)); + load_sp0((unsigned long)(cpu_entry_stack(cpu) + 1)); load_mm_ldt(&init_mm); --- a/arch/x86/kernel/dumpstack.c +++ b/arch/x86/kernel/dumpstack.c @@ -43,9 +43,9 @@ bool in_task_stack(unsigned long *stack, return true; } -bool in_sysenter_stack(unsigned long *stack, struct stack_info *info) +bool in_entry_stack(unsigned long *stack, struct stack_info *info) { - struct SYSENTER_stack *ss = cpu_SYSENTER_stack(smp_processor_id()); + struct entry_stack *ss = cpu_entry_stack(smp_processor_id()); void *begin = ss; void *end = ss + 1; @@ -53,7 +53,7 @@ bool in_sysenter_stack(unsigned long *st if ((void *)stack < begin || (void *)stack >= end) return false; - info->type = STACK_TYPE_SYSENTER; + info->type = STACK_TYPE_ENTRY; info->begin = begin; info->end = end; info->next_sp = NULL; @@ -111,13 +111,13 @@ void show_trace_log_lvl(struct task_stru * - task stack * - interrupt stack * - HW exception stacks (double fault, nmi, debug, mce) - * - SYSENTER stack + * - entry stack * * x86-32 can have up to four stacks: * - task stack * - softirq stack * - hardirq stack - * - SYSENTER stack + * - entry stack */ for (regs = NULL; stack; stack = PTR_ALIGN(stack_info.next_sp, sizeof(long))) { const char *stack_name; --- a/arch/x86/kernel/dumpstack_32.c +++ b/arch/x86/kernel/dumpstack_32.c @@ -26,8 +26,8 @@ const char *stack_type_name(enum stack_t if (type == STACK_TYPE_SOFTIRQ) return "SOFTIRQ"; - if (type == STACK_TYPE_SYSENTER) - return "SYSENTER"; + if (type == STACK_TYPE_ENTRY) + return "ENTRY_TRAMPOLINE"; return NULL; } @@ -96,7 +96,7 @@ int get_stack_info(unsigned long *stack, if (task != current) goto unknown; - if (in_sysenter_stack(stack, info)) + if (in_entry_stack(stack, info)) goto recursion_check; if (in_hardirq_stack(stack, info)) --- a/arch/x86/kernel/dumpstack_64.c +++ b/arch/x86/kernel/dumpstack_64.c @@ -37,8 +37,14 @@ const char *stack_type_name(enum stack_t if (type == STACK_TYPE_IRQ) return "IRQ"; - if (type == STACK_TYPE_SYSENTER) - return "SYSENTER"; + if (type == STACK_TYPE_ENTRY) { + /* + * On 64-bit, we have a generic entry stack that we + * use for all the kernel entry points, including + * SYSENTER. + */ + return "ENTRY_TRAMPOLINE"; + } if (type >= STACK_TYPE_EXCEPTION && type <= STACK_TYPE_EXCEPTION_LAST) return exception_stack_names[type - STACK_TYPE_EXCEPTION]; @@ -118,7 +124,7 @@ int get_stack_info(unsigned long *stack, if (in_irq_stack(stack, info)) goto recursion_check; - if (in_sysenter_stack(stack, info)) + if (in_entry_stack(stack, info)) goto recursion_check; goto unknown; Patches currently in stable-queue which might be from dave.hansen(a)linux.intel.com are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/doc: Remove obvious weirdnesses from the x86 MM layout documentation" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/doc: Remove obvious weirdnesses from the x86 MM layout documentation to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From e8ffe96e5933d417195268478479933d56213a3f Mon Sep 17 00:00:00 2001 From: Peter Zijlstra <peterz(a)infradead.org> Date: Tue, 5 Dec 2017 13:34:54 +0100 Subject: x86/doc: Remove obvious weirdnesses from the x86 MM layout documentation From: Peter Zijlstra <peterz(a)infradead.org> commit e8ffe96e5933d417195268478479933d56213a3f upstream. Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: David Laight <David.Laight(a)aculab.com> Cc: Denys Vlasenko <dvlasenk(a)redhat.com> Cc: Eduardo Valentin <eduval(a)amazon.com> Cc: Greg KH <gregkh(a)linuxfoundation.org> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: Will Deacon <will.deacon(a)arm.com> Cc: aliguori(a)amazon.com Cc: daniel.gruss(a)iaik.tugraz.at Cc: hughd(a)google.com Cc: keescook(a)google.com Cc: linux-mm(a)kvack.org Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- Documentation/x86/x86_64/mm.txt | 12 +++--------- 1 file changed, 3 insertions(+), 9 deletions(-) --- a/Documentation/x86/x86_64/mm.txt +++ b/Documentation/x86/x86_64/mm.txt @@ -1,6 +1,4 @@ -<previous description obsolete, deleted> - Virtual memory map with 4 level page tables: 0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm @@ -49,8 +47,9 @@ ffffffffffe00000 - ffffffffffffffff (=2 Architecture defines a 64-bit virtual address. Implementations can support less. Currently supported are 48- and 57-bit virtual addresses. Bits 63 -through to the most-significant implemented bit are set to either all ones -or all zero. This causes hole between user space and kernel addresses. +through to the most-significant implemented bit are sign extended. +This causes hole between user space and kernel addresses if you interpret them +as unsigned. The direct mapping covers all memory in the system up to the highest memory address (this means in some cases it can also include PCI memory @@ -60,9 +59,6 @@ vmalloc space is lazily synchronized int the processes using the page fault handler, with init_top_pgt as reference. -Current X86-64 implementations support up to 46 bits of address space (64 TB), -which is our current limit. This expands into MBZ space in the page tables. - We map EFI runtime services in the 'efi_pgd' PGD in a 64Gb large virtual memory window (this size is arbitrary, it can be raised later if needed). The mappings are not part of any other kernel PGD and are only available @@ -74,5 +70,3 @@ following fixmap section. Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all physical memory, vmalloc/ioremap space and virtual memory map are randomized. Their order is preserved but their base will be offset early at boot time. - --Andi Kleen, Jul 2004 Patches currently in stable-queue which might be from peterz(a)infradead.org are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-decoder-fix-and-update-the-opcodes-map.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/tools-headers-sync-objtool-uapi-header.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/objtool-fix-64-bit-build-on-32-bit-host.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/objtool-move-synced-files-to-their-original-relative-locations.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/objtool-move-kernel-headers-code-sync-check-to-a-script.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch queue-4.14/objtool-fix-cross-build.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/cpu_entry_area: Move it to a separate unit" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/cpu_entry_area: Move it to a separate unit to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-cpu_entry_area-move-it-to-a-separate-unit.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From ed1bbc40a0d10e0c5c74fe7bdc6298295cf40255 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner <tglx(a)linutronix.de> Date: Wed, 20 Dec 2017 18:28:54 +0100 Subject: x86/cpu_entry_area: Move it to a separate unit From: Thomas Gleixner <tglx(a)linutronix.de> commit ed1bbc40a0d10e0c5c74fe7bdc6298295cf40255 upstream. Separate the cpu_entry_area code out of cpu/common.c and the fixmap. Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/include/asm/cpu_entry_area.h | 52 +++++++++++++++++ arch/x86/include/asm/fixmap.h | 41 ------------- arch/x86/kernel/cpu/common.c | 94 ------------------------------ arch/x86/kernel/traps.c | 1 arch/x86/mm/Makefile | 2 arch/x86/mm/cpu_entry_area.c | 104 ++++++++++++++++++++++++++++++++++ 6 files changed, 159 insertions(+), 135 deletions(-) --- /dev/null +++ b/arch/x86/include/asm/cpu_entry_area.h @@ -0,0 +1,52 @@ +// SPDX-License-Identifier: GPL-2.0 + +#ifndef _ASM_X86_CPU_ENTRY_AREA_H +#define _ASM_X86_CPU_ENTRY_AREA_H + +#include <linux/percpu-defs.h> +#include <asm/processor.h> + +/* + * cpu_entry_area is a percpu region that contains things needed by the CPU + * and early entry/exit code. Real types aren't used for all fields here + * to avoid circular header dependencies. + * + * Every field is a virtual alias of some other allocated backing store. + * There is no direct allocation of a struct cpu_entry_area. + */ +struct cpu_entry_area { + char gdt[PAGE_SIZE]; + + /* + * The GDT is just below entry_stack and thus serves (on x86_64) as + * a a read-only guard page. + */ + struct entry_stack_page entry_stack_page; + + /* + * On x86_64, the TSS is mapped RO. On x86_32, it's mapped RW because + * we need task switches to work, and task switches write to the TSS. + */ + struct tss_struct tss; + + char entry_trampoline[PAGE_SIZE]; + +#ifdef CONFIG_X86_64 + /* + * Exception stacks used for IST entries. + * + * In the future, this should have a separate slot for each stack + * with guard pages between them. + */ + char exception_stacks[(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ]; +#endif +}; + +#define CPU_ENTRY_AREA_SIZE (sizeof(struct cpu_entry_area)) +#define CPU_ENTRY_AREA_PAGES (CPU_ENTRY_AREA_SIZE / PAGE_SIZE) + +DECLARE_PER_CPU(struct cpu_entry_area *, cpu_entry_area); + +extern void setup_cpu_entry_areas(void); + +#endif --- a/arch/x86/include/asm/fixmap.h +++ b/arch/x86/include/asm/fixmap.h @@ -25,6 +25,7 @@ #else #include <uapi/asm/vsyscall.h> #endif +#include <asm/cpu_entry_area.h> /* * We can't declare FIXADDR_TOP as variable for x86_64 because vsyscall @@ -45,46 +46,6 @@ extern unsigned long __FIXADDR_TOP; #endif /* - * cpu_entry_area is a percpu region in the fixmap that contains things - * needed by the CPU and early entry/exit code. Real types aren't used - * for all fields here to avoid circular header dependencies. - * - * Every field is a virtual alias of some other allocated backing store. - * There is no direct allocation of a struct cpu_entry_area. - */ -struct cpu_entry_area { - char gdt[PAGE_SIZE]; - - /* - * The GDT is just below entry_stack and thus serves (on x86_64) as - * a a read-only guard page. - */ - struct entry_stack_page entry_stack_page; - - /* - * On x86_64, the TSS is mapped RO. On x86_32, it's mapped RW because - * we need task switches to work, and task switches write to the TSS. - */ - struct tss_struct tss; - - char entry_trampoline[PAGE_SIZE]; - -#ifdef CONFIG_X86_64 - /* - * Exception stacks used for IST entries. - * - * In the future, this should have a separate slot for each stack - * with guard pages between them. - */ - char exception_stacks[(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ]; -#endif -}; - -#define CPU_ENTRY_AREA_PAGES (sizeof(struct cpu_entry_area) / PAGE_SIZE) - -extern void setup_cpu_entry_areas(void); - -/* * Here we define all the compile-time 'special' virtual * addresses. The point is to have a constant address at * compile time, but to set the physical address only --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -482,102 +482,8 @@ static const unsigned int exception_stac [0 ... N_EXCEPTION_STACKS - 1] = EXCEPTION_STKSZ, [DEBUG_STACK - 1] = DEBUG_STKSZ }; - -static DEFINE_PER_CPU_PAGE_ALIGNED(char, exception_stacks - [(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ]); -#endif - -static DEFINE_PER_CPU_PAGE_ALIGNED(struct entry_stack_page, - entry_stack_storage); - -static void __init -set_percpu_fixmap_pages(int idx, void *ptr, int pages, pgprot_t prot) -{ - for ( ; pages; pages--, idx--, ptr += PAGE_SIZE) - __set_fixmap(idx, per_cpu_ptr_to_phys(ptr), prot); -} - -/* Setup the fixmap mappings only once per-processor */ -static void __init setup_cpu_entry_area(int cpu) -{ -#ifdef CONFIG_X86_64 - extern char _entry_trampoline[]; - - /* On 64-bit systems, we use a read-only fixmap GDT and TSS. */ - pgprot_t gdt_prot = PAGE_KERNEL_RO; - pgprot_t tss_prot = PAGE_KERNEL_RO; -#else - /* - * On native 32-bit systems, the GDT cannot be read-only because - * our double fault handler uses a task gate, and entering through - * a task gate needs to change an available TSS to busy. If the - * GDT is read-only, that will triple fault. The TSS cannot be - * read-only because the CPU writes to it on task switches. - * - * On Xen PV, the GDT must be read-only because the hypervisor - * requires it. - */ - pgprot_t gdt_prot = boot_cpu_has(X86_FEATURE_XENPV) ? - PAGE_KERNEL_RO : PAGE_KERNEL; - pgprot_t tss_prot = PAGE_KERNEL; #endif - __set_fixmap(get_cpu_entry_area_index(cpu, gdt), get_cpu_gdt_paddr(cpu), gdt_prot); - set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, entry_stack_page), - per_cpu_ptr(&entry_stack_storage, cpu), 1, - PAGE_KERNEL); - - /* - * The Intel SDM says (Volume 3, 7.2.1): - * - * Avoid placing a page boundary in the part of the TSS that the - * processor reads during a task switch (the first 104 bytes). The - * processor may not correctly perform address translations if a - * boundary occurs in this area. During a task switch, the processor - * reads and writes into the first 104 bytes of each TSS (using - * contiguous physical addresses beginning with the physical address - * of the first byte of the TSS). So, after TSS access begins, if - * part of the 104 bytes is not physically contiguous, the processor - * will access incorrect information without generating a page-fault - * exception. - * - * There are also a lot of errata involving the TSS spanning a page - * boundary. Assert that we're not doing that. - */ - BUILD_BUG_ON((offsetof(struct tss_struct, x86_tss) ^ - offsetofend(struct tss_struct, x86_tss)) & PAGE_MASK); - BUILD_BUG_ON(sizeof(struct tss_struct) % PAGE_SIZE != 0); - set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, tss), - &per_cpu(cpu_tss_rw, cpu), - sizeof(struct tss_struct) / PAGE_SIZE, - tss_prot); - -#ifdef CONFIG_X86_32 - per_cpu(cpu_entry_area, cpu) = get_cpu_entry_area(cpu); -#endif - -#ifdef CONFIG_X86_64 - BUILD_BUG_ON(sizeof(exception_stacks) % PAGE_SIZE != 0); - BUILD_BUG_ON(sizeof(exception_stacks) != - sizeof(((struct cpu_entry_area *)0)->exception_stacks)); - set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, exception_stacks), - &per_cpu(exception_stacks, cpu), - sizeof(exception_stacks) / PAGE_SIZE, - PAGE_KERNEL); - - __set_fixmap(get_cpu_entry_area_index(cpu, entry_trampoline), - __pa_symbol(_entry_trampoline), PAGE_KERNEL_RX); -#endif -} - -void __init setup_cpu_entry_areas(void) -{ - unsigned int cpu; - - for_each_possible_cpu(cpu) - setup_cpu_entry_area(cpu); -} - /* Load the original GDT from the per-cpu structure */ void load_direct_gdt(int cpu) { --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -52,6 +52,7 @@ #include <asm/traps.h> #include <asm/desc.h> #include <asm/fpu/internal.h> +#include <asm/cpu_entry_area.h> #include <asm/mce.h> #include <asm/fixmap.h> #include <asm/mach_traps.h> --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -10,7 +10,7 @@ CFLAGS_REMOVE_mem_encrypt.o = -pg endif obj-y := init.o init_$(BITS).o fault.o ioremap.o extable.o pageattr.o mmap.o \ - pat.o pgtable.o physaddr.o setup_nx.o tlb.o + pat.o pgtable.o physaddr.o setup_nx.o tlb.o cpu_entry_area.o # Make sure __phys_addr has no stackprotector nostackp := $(call cc-option, -fno-stack-protector) --- /dev/null +++ b/arch/x86/mm/cpu_entry_area.c @@ -0,0 +1,104 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <linux/spinlock.h> +#include <linux/percpu.h> + +#include <asm/cpu_entry_area.h> +#include <asm/pgtable.h> +#include <asm/fixmap.h> +#include <asm/desc.h> + +static DEFINE_PER_CPU_PAGE_ALIGNED(struct entry_stack_page, entry_stack_storage); + +#ifdef CONFIG_X86_64 +static DEFINE_PER_CPU_PAGE_ALIGNED(char, exception_stacks + [(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ]); +#endif + +static void __init +set_percpu_fixmap_pages(int idx, void *ptr, int pages, pgprot_t prot) +{ + for ( ; pages; pages--, idx--, ptr += PAGE_SIZE) + __set_fixmap(idx, per_cpu_ptr_to_phys(ptr), prot); +} + +/* Setup the fixmap mappings only once per-processor */ +static void __init setup_cpu_entry_area(int cpu) +{ +#ifdef CONFIG_X86_64 + extern char _entry_trampoline[]; + + /* On 64-bit systems, we use a read-only fixmap GDT and TSS. */ + pgprot_t gdt_prot = PAGE_KERNEL_RO; + pgprot_t tss_prot = PAGE_KERNEL_RO; +#else + /* + * On native 32-bit systems, the GDT cannot be read-only because + * our double fault handler uses a task gate, and entering through + * a task gate needs to change an available TSS to busy. If the + * GDT is read-only, that will triple fault. The TSS cannot be + * read-only because the CPU writes to it on task switches. + * + * On Xen PV, the GDT must be read-only because the hypervisor + * requires it. + */ + pgprot_t gdt_prot = boot_cpu_has(X86_FEATURE_XENPV) ? + PAGE_KERNEL_RO : PAGE_KERNEL; + pgprot_t tss_prot = PAGE_KERNEL; +#endif + + __set_fixmap(get_cpu_entry_area_index(cpu, gdt), get_cpu_gdt_paddr(cpu), gdt_prot); + set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, entry_stack_page), + per_cpu_ptr(&entry_stack_storage, cpu), 1, + PAGE_KERNEL); + + /* + * The Intel SDM says (Volume 3, 7.2.1): + * + * Avoid placing a page boundary in the part of the TSS that the + * processor reads during a task switch (the first 104 bytes). The + * processor may not correctly perform address translations if a + * boundary occurs in this area. During a task switch, the processor + * reads and writes into the first 104 bytes of each TSS (using + * contiguous physical addresses beginning with the physical address + * of the first byte of the TSS). So, after TSS access begins, if + * part of the 104 bytes is not physically contiguous, the processor + * will access incorrect information without generating a page-fault + * exception. + * + * There are also a lot of errata involving the TSS spanning a page + * boundary. Assert that we're not doing that. + */ + BUILD_BUG_ON((offsetof(struct tss_struct, x86_tss) ^ + offsetofend(struct tss_struct, x86_tss)) & PAGE_MASK); + BUILD_BUG_ON(sizeof(struct tss_struct) % PAGE_SIZE != 0); + set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, tss), + &per_cpu(cpu_tss_rw, cpu), + sizeof(struct tss_struct) / PAGE_SIZE, + tss_prot); + +#ifdef CONFIG_X86_32 + per_cpu(cpu_entry_area, cpu) = get_cpu_entry_area(cpu); +#endif + +#ifdef CONFIG_X86_64 + BUILD_BUG_ON(sizeof(exception_stacks) % PAGE_SIZE != 0); + BUILD_BUG_ON(sizeof(exception_stacks) != + sizeof(((struct cpu_entry_area *)0)->exception_stacks)); + set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, exception_stacks), + &per_cpu(exception_stacks, cpu), + sizeof(exception_stacks) / PAGE_SIZE, + PAGE_KERNEL); + + __set_fixmap(get_cpu_entry_area_index(cpu, entry_trampoline), + __pa_symbol(_entry_trampoline), PAGE_KERNEL_RX); +#endif +} + +void __init setup_cpu_entry_areas(void) +{ + unsigned int cpu; + + for_each_possible_cpu(cpu) + setup_cpu_entry_area(cpu); +} Patches currently in stable-queue which might be from tglx(a)linutronix.de are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-decoder-fix-and-update-the-opcodes-map.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-cpu_entry_area-prevent-wraparound-in-setup_cpu_entry_area_ptes-on-32bit.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/pci-pm-force-devices-to-d0-in-pci_pm_thaw_noirq.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/tools-headers-sync-objtool-uapi-header.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/objtool-fix-64-bit-build-on-32-bit-host.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/objtool-move-synced-files-to-their-original-relative-locations.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/objtool-move-kernel-headers-code-sync-check-to-a-script.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch queue-4.14/objtool-fix-cross-build.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/cpu_entry_area: Move it out of the fixmap" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/cpu_entry_area: Move it out of the fixmap to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-cpu_entry_area-move-it-out-of-the-fixmap.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 92a0f81d89571e3e8759366e050ee05cc545ef99 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner <tglx(a)linutronix.de> Date: Wed, 20 Dec 2017 18:51:31 +0100 Subject: x86/cpu_entry_area: Move it out of the fixmap From: Thomas Gleixner <tglx(a)linutronix.de> commit 92a0f81d89571e3e8759366e050ee05cc545ef99 upstream. Put the cpu_entry_area into a separate P4D entry. The fixmap gets too big and 0-day already hit a case where the fixmap PTEs were cleared by cleanup_highmap(). Aside of that the fixmap API is a pain as it's all backwards. Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: linux-kernel(a)vger.kernel.org Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- Documentation/x86/x86_64/mm.txt | 2 arch/x86/include/asm/cpu_entry_area.h | 18 ++++++++ arch/x86/include/asm/desc.h | 1 arch/x86/include/asm/fixmap.h | 32 --------------- arch/x86/include/asm/pgtable_32_types.h | 15 +++++-- arch/x86/include/asm/pgtable_64_types.h | 47 +++++++++++++--------- arch/x86/kernel/dumpstack.c | 1 arch/x86/kernel/traps.c | 5 +- arch/x86/mm/cpu_entry_area.c | 66 ++++++++++++++++++++++++-------- arch/x86/mm/dump_pagetables.c | 6 ++ arch/x86/mm/init_32.c | 6 ++ arch/x86/mm/kasan_init_64.c | 29 +++++++------- arch/x86/mm/pgtable_32.c | 1 arch/x86/xen/mmu_pv.c | 2 14 files changed, 143 insertions(+), 88 deletions(-) --- a/Documentation/x86/x86_64/mm.txt +++ b/Documentation/x86/x86_64/mm.txt @@ -12,6 +12,7 @@ ffffea0000000000 - ffffeaffffffffff (=40 ... unused hole ... ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB) ... unused hole ... +fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks ... unused hole ... ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space @@ -35,6 +36,7 @@ ffd4000000000000 - ffd5ffffffffffff (=49 ... unused hole ... ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB) ... unused hole ... +fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks ... unused hole ... ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space --- a/arch/x86/include/asm/cpu_entry_area.h +++ b/arch/x86/include/asm/cpu_entry_area.h @@ -43,10 +43,26 @@ struct cpu_entry_area { }; #define CPU_ENTRY_AREA_SIZE (sizeof(struct cpu_entry_area)) -#define CPU_ENTRY_AREA_PAGES (CPU_ENTRY_AREA_SIZE / PAGE_SIZE) +#define CPU_ENTRY_AREA_TOT_SIZE (CPU_ENTRY_AREA_SIZE * NR_CPUS) DECLARE_PER_CPU(struct cpu_entry_area *, cpu_entry_area); extern void setup_cpu_entry_areas(void); +extern void cea_set_pte(void *cea_vaddr, phys_addr_t pa, pgprot_t flags); + +#define CPU_ENTRY_AREA_RO_IDT CPU_ENTRY_AREA_BASE +#define CPU_ENTRY_AREA_PER_CPU (CPU_ENTRY_AREA_RO_IDT + PAGE_SIZE) + +#define CPU_ENTRY_AREA_RO_IDT_VADDR ((void *)CPU_ENTRY_AREA_RO_IDT) + +#define CPU_ENTRY_AREA_MAP_SIZE \ + (CPU_ENTRY_AREA_PER_CPU + CPU_ENTRY_AREA_TOT_SIZE - CPU_ENTRY_AREA_BASE) + +extern struct cpu_entry_area *get_cpu_entry_area(int cpu); + +static inline struct entry_stack *cpu_entry_stack(int cpu) +{ + return &get_cpu_entry_area(cpu)->entry_stack_page.stack; +} #endif --- a/arch/x86/include/asm/desc.h +++ b/arch/x86/include/asm/desc.h @@ -7,6 +7,7 @@ #include <asm/mmu.h> #include <asm/fixmap.h> #include <asm/irq_vectors.h> +#include <asm/cpu_entry_area.h> #include <linux/smp.h> #include <linux/percpu.h> --- a/arch/x86/include/asm/fixmap.h +++ b/arch/x86/include/asm/fixmap.h @@ -25,7 +25,6 @@ #else #include <uapi/asm/vsyscall.h> #endif -#include <asm/cpu_entry_area.h> /* * We can't declare FIXADDR_TOP as variable for x86_64 because vsyscall @@ -84,7 +83,6 @@ enum fixed_addresses { FIX_IO_APIC_BASE_0, FIX_IO_APIC_BASE_END = FIX_IO_APIC_BASE_0 + MAX_IO_APICS - 1, #endif - FIX_RO_IDT, /* Virtual mapping for read-only IDT */ #ifdef CONFIG_X86_32 FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */ FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1, @@ -100,9 +98,6 @@ enum fixed_addresses { #ifdef CONFIG_X86_INTEL_MID FIX_LNW_VRTC, #endif - /* Fixmap entries to remap the GDTs, one per processor. */ - FIX_CPU_ENTRY_AREA_TOP, - FIX_CPU_ENTRY_AREA_BOTTOM = FIX_CPU_ENTRY_AREA_TOP + (CPU_ENTRY_AREA_PAGES * NR_CPUS) - 1, #ifdef CONFIG_ACPI_APEI_GHES /* Used for GHES mapping from assorted contexts */ @@ -143,7 +138,7 @@ enum fixed_addresses { extern void reserve_top_address(unsigned long reserve); #define FIXADDR_SIZE (__end_of_permanent_fixed_addresses << PAGE_SHIFT) -#define FIXADDR_START (FIXADDR_TOP - FIXADDR_SIZE) +#define FIXADDR_START (FIXADDR_TOP - FIXADDR_SIZE) extern int fixmaps_set; @@ -191,30 +186,5 @@ void __init *early_memremap_decrypted_wp void __early_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t flags); -static inline unsigned int __get_cpu_entry_area_page_index(int cpu, int page) -{ - BUILD_BUG_ON(sizeof(struct cpu_entry_area) % PAGE_SIZE != 0); - - return FIX_CPU_ENTRY_AREA_BOTTOM - cpu*CPU_ENTRY_AREA_PAGES - page; -} - -#define __get_cpu_entry_area_offset_index(cpu, offset) ({ \ - BUILD_BUG_ON(offset % PAGE_SIZE != 0); \ - __get_cpu_entry_area_page_index(cpu, offset / PAGE_SIZE); \ - }) - -#define get_cpu_entry_area_index(cpu, field) \ - __get_cpu_entry_area_offset_index((cpu), offsetof(struct cpu_entry_area, field)) - -static inline struct cpu_entry_area *get_cpu_entry_area(int cpu) -{ - return (struct cpu_entry_area *)__fix_to_virt(__get_cpu_entry_area_page_index(cpu, 0)); -} - -static inline struct entry_stack *cpu_entry_stack(int cpu) -{ - return &get_cpu_entry_area(cpu)->entry_stack_page.stack; -} - #endif /* !__ASSEMBLY__ */ #endif /* _ASM_X86_FIXMAP_H */ --- a/arch/x86/include/asm/pgtable_32_types.h +++ b/arch/x86/include/asm/pgtable_32_types.h @@ -38,13 +38,22 @@ extern bool __vmalloc_start_set; /* set #define LAST_PKMAP 1024 #endif -#define PKMAP_BASE ((FIXADDR_START - PAGE_SIZE * (LAST_PKMAP + 1)) \ - & PMD_MASK) +/* + * Define this here and validate with BUILD_BUG_ON() in pgtable_32.c + * to avoid include recursion hell + */ +#define CPU_ENTRY_AREA_PAGES (NR_CPUS * 40) + +#define CPU_ENTRY_AREA_BASE \ + ((FIXADDR_START - PAGE_SIZE * (CPU_ENTRY_AREA_PAGES + 1)) & PMD_MASK) + +#define PKMAP_BASE \ + ((CPU_ENTRY_AREA_BASE - PAGE_SIZE) & PMD_MASK) #ifdef CONFIG_HIGHMEM # define VMALLOC_END (PKMAP_BASE - 2 * PAGE_SIZE) #else -# define VMALLOC_END (FIXADDR_START - 2 * PAGE_SIZE) +# define VMALLOC_END (CPU_ENTRY_AREA_BASE - 2 * PAGE_SIZE) #endif #define MODULES_VADDR VMALLOC_START --- a/arch/x86/include/asm/pgtable_64_types.h +++ b/arch/x86/include/asm/pgtable_64_types.h @@ -76,32 +76,41 @@ typedef struct { pteval_t pte; } pte_t; #define PGDIR_MASK (~(PGDIR_SIZE - 1)) /* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */ -#define MAXMEM _AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL) +#define MAXMEM _AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL) + #ifdef CONFIG_X86_5LEVEL -#define VMALLOC_SIZE_TB _AC(16384, UL) -#define __VMALLOC_BASE _AC(0xff92000000000000, UL) -#define __VMEMMAP_BASE _AC(0xffd4000000000000, UL) +# define VMALLOC_SIZE_TB _AC(16384, UL) +# define __VMALLOC_BASE _AC(0xff92000000000000, UL) +# define __VMEMMAP_BASE _AC(0xffd4000000000000, UL) #else -#define VMALLOC_SIZE_TB _AC(32, UL) -#define __VMALLOC_BASE _AC(0xffffc90000000000, UL) -#define __VMEMMAP_BASE _AC(0xffffea0000000000, UL) +# define VMALLOC_SIZE_TB _AC(32, UL) +# define __VMALLOC_BASE _AC(0xffffc90000000000, UL) +# define __VMEMMAP_BASE _AC(0xffffea0000000000, UL) #endif + #ifdef CONFIG_RANDOMIZE_MEMORY -#define VMALLOC_START vmalloc_base -#define VMEMMAP_START vmemmap_base +# define VMALLOC_START vmalloc_base +# define VMEMMAP_START vmemmap_base #else -#define VMALLOC_START __VMALLOC_BASE -#define VMEMMAP_START __VMEMMAP_BASE +# define VMALLOC_START __VMALLOC_BASE +# define VMEMMAP_START __VMEMMAP_BASE #endif /* CONFIG_RANDOMIZE_MEMORY */ -#define VMALLOC_END (VMALLOC_START + _AC((VMALLOC_SIZE_TB << 40) - 1, UL)) -#define MODULES_VADDR (__START_KERNEL_map + KERNEL_IMAGE_SIZE) + +#define VMALLOC_END (VMALLOC_START + _AC((VMALLOC_SIZE_TB << 40) - 1, UL)) + +#define MODULES_VADDR (__START_KERNEL_map + KERNEL_IMAGE_SIZE) /* The module sections ends with the start of the fixmap */ -#define MODULES_END __fix_to_virt(__end_of_fixed_addresses + 1) -#define MODULES_LEN (MODULES_END - MODULES_VADDR) -#define ESPFIX_PGD_ENTRY _AC(-2, UL) -#define ESPFIX_BASE_ADDR (ESPFIX_PGD_ENTRY << P4D_SHIFT) -#define EFI_VA_START ( -4 * (_AC(1, UL) << 30)) -#define EFI_VA_END (-68 * (_AC(1, UL) << 30)) +#define MODULES_END __fix_to_virt(__end_of_fixed_addresses + 1) +#define MODULES_LEN (MODULES_END - MODULES_VADDR) + +#define ESPFIX_PGD_ENTRY _AC(-2, UL) +#define ESPFIX_BASE_ADDR (ESPFIX_PGD_ENTRY << P4D_SHIFT) + +#define CPU_ENTRY_AREA_PGD _AC(-3, UL) +#define CPU_ENTRY_AREA_BASE (CPU_ENTRY_AREA_PGD << P4D_SHIFT) + +#define EFI_VA_START ( -4 * (_AC(1, UL) << 30)) +#define EFI_VA_END (-68 * (_AC(1, UL) << 30)) #define EARLY_DYNAMIC_PAGE_TABLES 64 --- a/arch/x86/kernel/dumpstack.c +++ b/arch/x86/kernel/dumpstack.c @@ -18,6 +18,7 @@ #include <linux/nmi.h> #include <linux/sysfs.h> +#include <asm/cpu_entry_area.h> #include <asm/stacktrace.h> #include <asm/unwind.h> --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -951,8 +951,9 @@ void __init trap_init(void) * "sidt" instruction will not leak the location of the kernel, and * to defend the IDT against arbitrary memory write vulnerabilities. * It will be reloaded in cpu_init() */ - __set_fixmap(FIX_RO_IDT, __pa_symbol(idt_table), PAGE_KERNEL_RO); - idt_descr.address = fix_to_virt(FIX_RO_IDT); + cea_set_pte(CPU_ENTRY_AREA_RO_IDT_VADDR, __pa_symbol(idt_table), + PAGE_KERNEL_RO); + idt_descr.address = CPU_ENTRY_AREA_RO_IDT; /* * Should be a barrier for any external CPU state: --- a/arch/x86/mm/cpu_entry_area.c +++ b/arch/x86/mm/cpu_entry_area.c @@ -15,11 +15,27 @@ static DEFINE_PER_CPU_PAGE_ALIGNED(char, [(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ]); #endif +struct cpu_entry_area *get_cpu_entry_area(int cpu) +{ + unsigned long va = CPU_ENTRY_AREA_PER_CPU + cpu * CPU_ENTRY_AREA_SIZE; + BUILD_BUG_ON(sizeof(struct cpu_entry_area) % PAGE_SIZE != 0); + + return (struct cpu_entry_area *) va; +} +EXPORT_SYMBOL(get_cpu_entry_area); + +void cea_set_pte(void *cea_vaddr, phys_addr_t pa, pgprot_t flags) +{ + unsigned long va = (unsigned long) cea_vaddr; + + set_pte_vaddr(va, pfn_pte(pa >> PAGE_SHIFT, flags)); +} + static void __init -set_percpu_fixmap_pages(int idx, void *ptr, int pages, pgprot_t prot) +cea_map_percpu_pages(void *cea_vaddr, void *ptr, int pages, pgprot_t prot) { - for ( ; pages; pages--, idx--, ptr += PAGE_SIZE) - __set_fixmap(idx, per_cpu_ptr_to_phys(ptr), prot); + for ( ; pages; pages--, cea_vaddr+= PAGE_SIZE, ptr += PAGE_SIZE) + cea_set_pte(cea_vaddr, per_cpu_ptr_to_phys(ptr), prot); } /* Setup the fixmap mappings only once per-processor */ @@ -47,10 +63,12 @@ static void __init setup_cpu_entry_area( pgprot_t tss_prot = PAGE_KERNEL; #endif - __set_fixmap(get_cpu_entry_area_index(cpu, gdt), get_cpu_gdt_paddr(cpu), gdt_prot); - set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, entry_stack_page), - per_cpu_ptr(&entry_stack_storage, cpu), 1, - PAGE_KERNEL); + cea_set_pte(&get_cpu_entry_area(cpu)->gdt, get_cpu_gdt_paddr(cpu), + gdt_prot); + + cea_map_percpu_pages(&get_cpu_entry_area(cpu)->entry_stack_page, + per_cpu_ptr(&entry_stack_storage, cpu), 1, + PAGE_KERNEL); /* * The Intel SDM says (Volume 3, 7.2.1): @@ -72,10 +90,9 @@ static void __init setup_cpu_entry_area( BUILD_BUG_ON((offsetof(struct tss_struct, x86_tss) ^ offsetofend(struct tss_struct, x86_tss)) & PAGE_MASK); BUILD_BUG_ON(sizeof(struct tss_struct) % PAGE_SIZE != 0); - set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, tss), - &per_cpu(cpu_tss_rw, cpu), - sizeof(struct tss_struct) / PAGE_SIZE, - tss_prot); + cea_map_percpu_pages(&get_cpu_entry_area(cpu)->tss, + &per_cpu(cpu_tss_rw, cpu), + sizeof(struct tss_struct) / PAGE_SIZE, tss_prot); #ifdef CONFIG_X86_32 per_cpu(cpu_entry_area, cpu) = get_cpu_entry_area(cpu); @@ -85,20 +102,37 @@ static void __init setup_cpu_entry_area( BUILD_BUG_ON(sizeof(exception_stacks) % PAGE_SIZE != 0); BUILD_BUG_ON(sizeof(exception_stacks) != sizeof(((struct cpu_entry_area *)0)->exception_stacks)); - set_percpu_fixmap_pages(get_cpu_entry_area_index(cpu, exception_stacks), - &per_cpu(exception_stacks, cpu), - sizeof(exception_stacks) / PAGE_SIZE, - PAGE_KERNEL); + cea_map_percpu_pages(&get_cpu_entry_area(cpu)->exception_stacks, + &per_cpu(exception_stacks, cpu), + sizeof(exception_stacks) / PAGE_SIZE, PAGE_KERNEL); - __set_fixmap(get_cpu_entry_area_index(cpu, entry_trampoline), + cea_set_pte(&get_cpu_entry_area(cpu)->entry_trampoline, __pa_symbol(_entry_trampoline), PAGE_KERNEL_RX); #endif } +static __init void setup_cpu_entry_area_ptes(void) +{ +#ifdef CONFIG_X86_32 + unsigned long start, end; + + BUILD_BUG_ON(CPU_ENTRY_AREA_PAGES * PAGE_SIZE < CPU_ENTRY_AREA_MAP_SIZE); + BUG_ON(CPU_ENTRY_AREA_BASE & ~PMD_MASK); + + start = CPU_ENTRY_AREA_BASE; + end = start + CPU_ENTRY_AREA_MAP_SIZE; + + for (; start < end; start += PMD_SIZE) + populate_extra_pte(start); +#endif +} + void __init setup_cpu_entry_areas(void) { unsigned int cpu; + setup_cpu_entry_area_ptes(); + for_each_possible_cpu(cpu) setup_cpu_entry_area(cpu); } --- a/arch/x86/mm/dump_pagetables.c +++ b/arch/x86/mm/dump_pagetables.c @@ -58,6 +58,7 @@ enum address_markers_idx { KASAN_SHADOW_START_NR, KASAN_SHADOW_END_NR, #endif + CPU_ENTRY_AREA_NR, #ifdef CONFIG_X86_ESPFIX64 ESPFIX_START_NR, #endif @@ -81,6 +82,7 @@ static struct addr_marker address_marker [KASAN_SHADOW_START_NR] = { KASAN_SHADOW_START, "KASAN shadow" }, [KASAN_SHADOW_END_NR] = { KASAN_SHADOW_END, "KASAN shadow end" }, #endif + [CPU_ENTRY_AREA_NR] = { CPU_ENTRY_AREA_BASE,"CPU entry Area" }, #ifdef CONFIG_X86_ESPFIX64 [ESPFIX_START_NR] = { ESPFIX_BASE_ADDR, "ESPfix Area", 16 }, #endif @@ -104,6 +106,7 @@ enum address_markers_idx { #ifdef CONFIG_HIGHMEM PKMAP_BASE_NR, #endif + CPU_ENTRY_AREA_NR, FIXADDR_START_NR, END_OF_SPACE_NR, }; @@ -116,6 +119,7 @@ static struct addr_marker address_marker #ifdef CONFIG_HIGHMEM [PKMAP_BASE_NR] = { 0UL, "Persistent kmap() Area" }, #endif + [CPU_ENTRY_AREA_NR] = { 0UL, "CPU entry area" }, [FIXADDR_START_NR] = { 0UL, "Fixmap area" }, [END_OF_SPACE_NR] = { -1, NULL } }; @@ -541,8 +545,8 @@ static int __init pt_dump_init(void) address_markers[PKMAP_BASE_NR].start_address = PKMAP_BASE; # endif address_markers[FIXADDR_START_NR].start_address = FIXADDR_START; + address_markers[CPU_ENTRY_AREA_NR].start_address = CPU_ENTRY_AREA_BASE; #endif - return 0; } __initcall(pt_dump_init); --- a/arch/x86/mm/init_32.c +++ b/arch/x86/mm/init_32.c @@ -50,6 +50,7 @@ #include <asm/setup.h> #include <asm/set_memory.h> #include <asm/page_types.h> +#include <asm/cpu_entry_area.h> #include <asm/init.h> #include "mm_internal.h" @@ -766,6 +767,7 @@ void __init mem_init(void) mem_init_print_info(NULL); printk(KERN_INFO "virtual kernel memory layout:\n" " fixmap : 0x%08lx - 0x%08lx (%4ld kB)\n" + " cpu_entry : 0x%08lx - 0x%08lx (%4ld kB)\n" #ifdef CONFIG_HIGHMEM " pkmap : 0x%08lx - 0x%08lx (%4ld kB)\n" #endif @@ -777,6 +779,10 @@ void __init mem_init(void) FIXADDR_START, FIXADDR_TOP, (FIXADDR_TOP - FIXADDR_START) >> 10, + CPU_ENTRY_AREA_BASE, + CPU_ENTRY_AREA_BASE + CPU_ENTRY_AREA_MAP_SIZE, + CPU_ENTRY_AREA_MAP_SIZE >> 10, + #ifdef CONFIG_HIGHMEM PKMAP_BASE, PKMAP_BASE+LAST_PKMAP*PAGE_SIZE, (LAST_PKMAP*PAGE_SIZE) >> 10, --- a/arch/x86/mm/kasan_init_64.c +++ b/arch/x86/mm/kasan_init_64.c @@ -15,6 +15,7 @@ #include <asm/tlbflush.h> #include <asm/sections.h> #include <asm/pgtable.h> +#include <asm/cpu_entry_area.h> extern struct range pfn_mapped[E820_MAX_ENTRIES]; @@ -322,31 +323,33 @@ void __init kasan_init(void) map_range(&pfn_mapped[i]); } - kasan_populate_zero_shadow( - kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM), - kasan_mem_to_shadow((void *)__START_KERNEL_map)); - - kasan_populate_shadow((unsigned long)kasan_mem_to_shadow(_stext), - (unsigned long)kasan_mem_to_shadow(_end), - early_pfn_to_nid(__pa(_stext))); - - shadow_cpu_entry_begin = (void *)__fix_to_virt(FIX_CPU_ENTRY_AREA_BOTTOM); + shadow_cpu_entry_begin = (void *)CPU_ENTRY_AREA_BASE; shadow_cpu_entry_begin = kasan_mem_to_shadow(shadow_cpu_entry_begin); shadow_cpu_entry_begin = (void *)round_down((unsigned long)shadow_cpu_entry_begin, PAGE_SIZE); - shadow_cpu_entry_end = (void *)(__fix_to_virt(FIX_CPU_ENTRY_AREA_TOP) + PAGE_SIZE); + shadow_cpu_entry_end = (void *)(CPU_ENTRY_AREA_BASE + + CPU_ENTRY_AREA_MAP_SIZE); shadow_cpu_entry_end = kasan_mem_to_shadow(shadow_cpu_entry_end); shadow_cpu_entry_end = (void *)round_up((unsigned long)shadow_cpu_entry_end, PAGE_SIZE); - kasan_populate_zero_shadow(kasan_mem_to_shadow((void *)MODULES_END), - shadow_cpu_entry_begin); + kasan_populate_zero_shadow( + kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM), + shadow_cpu_entry_begin); kasan_populate_shadow((unsigned long)shadow_cpu_entry_begin, (unsigned long)shadow_cpu_entry_end, 0); - kasan_populate_zero_shadow(shadow_cpu_entry_end, (void *)KASAN_SHADOW_END); + kasan_populate_zero_shadow(shadow_cpu_entry_end, + kasan_mem_to_shadow((void *)__START_KERNEL_map)); + + kasan_populate_shadow((unsigned long)kasan_mem_to_shadow(_stext), + (unsigned long)kasan_mem_to_shadow(_end), + early_pfn_to_nid(__pa(_stext))); + + kasan_populate_zero_shadow(kasan_mem_to_shadow((void *)MODULES_END), + (void *)KASAN_SHADOW_END); load_cr3(init_top_pgt); __flush_tlb_all(); --- a/arch/x86/mm/pgtable_32.c +++ b/arch/x86/mm/pgtable_32.c @@ -10,6 +10,7 @@ #include <linux/pagemap.h> #include <linux/spinlock.h> +#include <asm/cpu_entry_area.h> #include <asm/pgtable.h> #include <asm/pgalloc.h> #include <asm/fixmap.h> --- a/arch/x86/xen/mmu_pv.c +++ b/arch/x86/xen/mmu_pv.c @@ -2261,7 +2261,6 @@ static void xen_set_fixmap(unsigned idx, switch (idx) { case FIX_BTMAP_END ... FIX_BTMAP_BEGIN: - case FIX_RO_IDT: #ifdef CONFIG_X86_32 case FIX_WP_TEST: # ifdef CONFIG_HIGHMEM @@ -2272,7 +2271,6 @@ static void xen_set_fixmap(unsigned idx, #endif case FIX_TEXT_POKE0: case FIX_TEXT_POKE1: - case FIX_CPU_ENTRY_AREA_TOP ... FIX_CPU_ENTRY_AREA_BOTTOM: /* All local page mappings */ pte = pfn_pte(phys, prot); break; Patches currently in stable-queue which might be from tglx(a)linutronix.de are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-decoder-fix-and-update-the-opcodes-map.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-cpu_entry_area-prevent-wraparound-in-setup_cpu_entry_area_ptes-on-32bit.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/pci-pm-force-devices-to-d0-in-pci_pm_thaw_noirq.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/tools-headers-sync-objtool-uapi-header.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/objtool-fix-64-bit-build-on-32-bit-host.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/objtool-move-synced-files-to-their-original-relative-locations.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/objtool-move-kernel-headers-code-sync-check-to-a-script.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch queue-4.14/objtool-fix-cross-build.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "x86/cpu_entry_area: Prevent wraparound in setup_cpu_entry_area_ptes() on 32bit" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/cpu_entry_area: Prevent wraparound in setup_cpu_entry_area_ptes() on 32bit to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-cpu_entry_area-prevent-wraparound-in-setup_cpu_entry_area_ptes-on-32bit.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From f6c4fd506cb626e4346aa81688f255e593a7c5a0 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner <tglx(a)linutronix.de> Date: Sat, 23 Dec 2017 19:45:11 +0100 Subject: x86/cpu_entry_area: Prevent wraparound in setup_cpu_entry_area_ptes() on 32bit From: Thomas Gleixner <tglx(a)linutronix.de> commit f6c4fd506cb626e4346aa81688f255e593a7c5a0 upstream. The loop which populates the CPU entry area PMDs can wrap around on 32bit machines when the number of CPUs is small. It worked wonderful for NR_CPUS=64 for whatever reason and the moron who wrote that code did not bother to test it with !SMP. Check for the wraparound to fix it. Fixes: 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap") Reported-by: kernel test robot <fengguang.wu(a)intel.com> Signed-off-by: Thomas "Feels stupid" Gleixner <tglx(a)linutronix.de> Tested-by: Borislav Petkov <bp(a)alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/mm/cpu_entry_area.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) --- a/arch/x86/mm/cpu_entry_area.c +++ b/arch/x86/mm/cpu_entry_area.c @@ -122,7 +122,8 @@ static __init void setup_cpu_entry_area_ start = CPU_ENTRY_AREA_BASE; end = start + CPU_ENTRY_AREA_MAP_SIZE; - for (; start < end; start += PMD_SIZE) + /* Careful here: start + PMD_SIZE might wrap around */ + for (; start < end && start >= CPU_ENTRY_AREA_BASE; start += PMD_SIZE) populate_extra_pte(start); #endif } Patches currently in stable-queue which might be from tglx(a)linutronix.de are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-decoder-fix-and-update-the-opcodes-map.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-cpu_entry_area-prevent-wraparound-in-setup_cpu_entry_area_ptes-on-32bit.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/pci-pm-force-devices-to-d0-in-pci_pm_thaw_noirq.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/tools-headers-sync-objtool-uapi-header.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/objtool-fix-64-bit-build-on-32-bit-host.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/objtool-move-synced-files-to-their-original-relative-locations.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/objtool-move-kernel-headers-code-sync-check-to-a-script.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch queue-4.14/objtool-fix-cross-build.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "spi: xilinx: Detect stall with Unknown commands" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled spi: xilinx: Detect stall with Unknown commands to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: spi-xilinx-detect-stall-with-unknown-commands.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 5a1314fa697fc65cefaba64cd4699bfc3e6882a6 Mon Sep 17 00:00:00 2001 From: Ricardo Ribalda <ricardo.ribalda(a)gmail.com> Date: Tue, 21 Nov 2017 10:09:02 +0100 Subject: spi: xilinx: Detect stall with Unknown commands From: Ricardo Ribalda Delgado <ricardo.ribalda(a)gmail.com> commit 5a1314fa697fc65cefaba64cd4699bfc3e6882a6 upstream. When the core is configured in C_SPI_MODE > 0, it integrates a lookup table that automatically configures the core in dual or quad mode based on the command (first byte on the tx fifo). Unfortunately, that list mode_?_memoy_*.mif does not contain all the supported commands by the flash. Since 4.14 spi-nor automatically tries to probe the flash using SFDP (command 0x5a), and that command is not part of the list_mode table. Whit the right combination of C_SPI_MODE and C_SPI_MEMORY this leads into a stall that can only be recovered with a soft rest. This patch detects this kind of stall and returns -EIO to the caller on those commands. spi-nor can handle this error properly: m25p80 spi0.0: Detected stall. Check C_SPI_MODE and C_SPI_MEMORY. 0x21 0x2404 m25p80 spi0.0: SPI transfer failed: -5 spi_master spi0: failed to transfer one message from queue m25p80 spi0.0: s25sl064p (8192 Kbytes) Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda(a)gmail.com> Signed-off-by: Mark Brown <broonie(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/spi/spi-xilinx.c | 11 +++++++++++ 1 file changed, 11 insertions(+) --- a/drivers/spi/spi-xilinx.c +++ b/drivers/spi/spi-xilinx.c @@ -271,6 +271,7 @@ static int xilinx_spi_txrx_bufs(struct s while (remaining_words) { int n_words, tx_words, rx_words; u32 sr; + int stalled; n_words = min(remaining_words, xspi->buffer_size); @@ -299,7 +300,17 @@ static int xilinx_spi_txrx_bufs(struct s /* Read out all the data from the Rx FIFO */ rx_words = n_words; + stalled = 10; while (rx_words) { + if (rx_words == n_words && !(stalled--) && + !(sr & XSPI_SR_TX_EMPTY_MASK) && + (sr & XSPI_SR_RX_EMPTY_MASK)) { + dev_err(&spi->dev, + "Detected stall. Check C_SPI_MODE and C_SPI_MEMORY\n"); + xspi_init_hw(xspi); + return -EIO; + } + if ((sr & XSPI_SR_TX_EMPTY_MASK) && (rx_words > 1)) { xilinx_spi_rx(xspi); rx_words--; Patches currently in stable-queue which might be from ricardo.ribalda(a)gmail.com are queue-4.14/spi-xilinx-detect-stall-with-unknown-commands.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "spi: a3700: Fix clk prescaling for coefficient over 15" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled spi: a3700: Fix clk prescaling for coefficient over 15 to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: spi-a3700-fix-clk-prescaling-for-coefficient-over-15.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 251c201bf4f8b5bf4f1ccb4f8920eed2e1f57580 Mon Sep 17 00:00:00 2001 From: Maxime Chevallier <maxime.chevallier(a)smile.fr> Date: Mon, 27 Nov 2017 15:16:32 +0100 Subject: spi: a3700: Fix clk prescaling for coefficient over 15 From: Maxime Chevallier <maxime.chevallier(a)smile.fr> commit 251c201bf4f8b5bf4f1ccb4f8920eed2e1f57580 upstream. The Armada 3700 SPI controller has 2 ranges of prescaler coefficients. One ranging from 0 to 15 by steps of 1, and one ranging from 0 to 30 by steps of 2. This commit fixes the prescaler coefficients that are over 15 so that it uses the correct range of values. The prescaling coefficient is rounded to the upper value if it is odd. This was tested on Espressobin with spidev and a locigal analyser. Signed-off-by: Maxime Chevallier <maxime.chevallier(a)smile.fr> Signed-off-by: Mark Brown <broonie(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/spi/spi-armada-3700.c | 8 ++++++++ 1 file changed, 8 insertions(+) --- a/drivers/spi/spi-armada-3700.c +++ b/drivers/spi/spi-armada-3700.c @@ -79,6 +79,7 @@ #define A3700_SPI_BYTE_LEN BIT(5) #define A3700_SPI_CLK_PRESCALE BIT(0) #define A3700_SPI_CLK_PRESCALE_MASK (0x1f) +#define A3700_SPI_CLK_EVEN_OFFS (0x10) #define A3700_SPI_WFIFO_THRS_BIT 28 #define A3700_SPI_RFIFO_THRS_BIT 24 @@ -220,6 +221,13 @@ static void a3700_spi_clock_set(struct a prescale = DIV_ROUND_UP(clk_get_rate(a3700_spi->clk), speed_hz); + /* For prescaler values over 15, we can only set it by steps of 2. + * Starting from A3700_SPI_CLK_EVEN_OFFS, we set values from 0 up to + * 30. We only use this range from 16 to 30. + */ + if (prescale > 15) + prescale = A3700_SPI_CLK_EVEN_OFFS + DIV_ROUND_UP(prescale, 2); + val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG); val = val & ~A3700_SPI_CLK_PRESCALE_MASK; Patches currently in stable-queue which might be from maxime.chevallier(a)smile.fr are queue-4.14/spi-a3700-fix-clk-prescaling-for-coefficient-over-15.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "Revert "parisc: Re-enable interrupts early"" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled Revert "parisc: Re-enable interrupts early" to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: revert-parisc-re-enable-interrupts-early.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 9352aeada4d8d8753fc0e414fbfe8fdfcb68a12c Mon Sep 17 00:00:00 2001 From: John David Anglin <dave.anglin(a)bell.net> Date: Mon, 13 Nov 2017 19:35:33 -0500 Subject: Revert "parisc: Re-enable interrupts early" From: John David Anglin <dave.anglin(a)bell.net> commit 9352aeada4d8d8753fc0e414fbfe8fdfcb68a12c upstream. This reverts commit 5c38602d83e584047906b41b162ababd4db4106d. Interrupts can't be enabled early because the register saves are done on the thread stack prior to switching to the IRQ stack. This caused stack overflows and the thread stack needed increasing to 32k. Even then, stack overflows still occasionally occurred. Background: Even with a 32 kB thread stack, I have seen instances where the thread stack overflowed on the mx3210 buildd. Detection of stack overflow only occurs when we have an external interrupt. When an external interrupt occurs, we switch to the thread stack if we are not already on a kernel stack. Then, registers and specials are saved to the kernel stack. The bug occurs in intr_return where interrupts are reenabled prior to returning from the interrupt. This was done incase we need to schedule or deliver signals. However, it introduces the possibility that multiple external interrupts may occur on the thread stack and cause a stack overflow. These might not be detected and cause the kernel to misbehave in random ways. This patch changes the code back to only reenable interrupts when we are going to schedule or deliver signals. As a result, we generally return from an interrupt before reenabling interrupts. This minimizes the growth of the thread stack. Fixes: 5c38602d83e5 ("parisc: Re-enable interrupts early") Signed-off-by: John David Anglin <dave.anglin(a)bell.net> Signed-off-by: Helge Deller <deller(a)gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/parisc/kernel/entry.S | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) --- a/arch/parisc/kernel/entry.S +++ b/arch/parisc/kernel/entry.S @@ -878,9 +878,6 @@ ENTRY_CFI(syscall_exit_rfi) STREG %r19,PT_SR7(%r16) intr_return: - /* NOTE: Need to enable interrupts incase we schedule. */ - ssm PSW_SM_I, %r0 - /* check for reschedule */ mfctl %cr30,%r1 LDREG TI_FLAGS(%r1),%r19 /* sched.h: TIF_NEED_RESCHED */ @@ -907,6 +904,11 @@ intr_check_sig: LDREG PT_IASQ1(%r16), %r20 cmpib,COND(=),n 0,%r20,intr_restore /* backward */ + /* NOTE: We need to enable interrupts if we have to deliver + * signals. We used to do this earlier but it caused kernel + * stack overflows. */ + ssm PSW_SM_I, %r0 + copy %r0, %r25 /* long in_syscall = 0 */ #ifdef CONFIG_64BIT ldo -16(%r30),%r29 /* Reference param save area */ @@ -958,6 +960,10 @@ intr_do_resched: cmpib,COND(=) 0, %r20, intr_do_preempt nop + /* NOTE: We need to enable interrupts if we schedule. We used + * to do this earlier but it caused kernel stack overflows. */ + ssm PSW_SM_I, %r0 + #ifdef CONFIG_64BIT ldo -16(%r30),%r29 /* Reference param save area */ #endif Patches currently in stable-queue which might be from dave.anglin(a)bell.net are queue-4.14/revert-parisc-re-enable-interrupts-early.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] [PATCH] Revert "ipmi_si: fix memory leak on new_smi"

by John Einar Reitan

This reverts commit c97e41076a298dbc4e910c33048e553658388eed, which incorrectly was taken from upstream c0a32fe13cd323ca9420500b16fd69589c9ba91e. The referenced memory leak doesn't exist on the 4.14 stable branch as the new logic of doing the kzalloc hasn't moved to this function. By adding this kfree we actually end up doing double kfree as all callers of smi_add does a kfree on error. Sample with SLAB_FREELIST_HARDENED=y: ipmi_si: Adding ACPI-specified kcs state machine IPMI System Interface driver. ipmi_si: probing via SPMI ipmi_si: SPMI: io 0xca2 regsize 1 spacing 1 irq 0 (NULL device *): SPMI-specified kcs state machine: duplicate ------------[ cut here ]------------ kernel BUG at mm/slub.c:295! invalid opcode: 0000 [#1] SMP Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.8-gentoo-r1 #5 Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 2.2 02/20/2015 task: ffff88080c208000 task.stack: ffffc90000020000 RIP: 0010:kfree+0xf5/0x157 RSP: 0000:ffffc90000023e58 EFLAGS: 00010246 RAX: ffff88080b2e6200 RBX: ffff88080b2e6200 RCX: ffff88080b2e6200 RDX: 000000000000008e RSI: ffff88082fc1cd60 RDI: ffff88080c003080 RBP: ffffc90000002808 R08: 000000000001cd60 R09: ffffffff814da10e R10: ffffea00202cb980 R11: 000000000000005c R12: ffffffff814da10e R13: 00000000ffffffed R14: ffffffff82317bd0 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff88082fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000002e09001 CR4: 00000000001606f0 Call Trace: init_ipmi_si+0x493/0x5c7 ? cleanup_ipmi_si+0x84/0x84 ? set_debug_rodata+0xc/0xc ? kthread+0x4c/0x11c do_one_initcall+0x94/0x13d ? set_debug_rodata+0xc/0xc kernel_init_freeable+0x112/0x18e ? rest_init+0xa0/0xa0 kernel_init+0x5/0xe1 ret_from_fork+0x22/0x30 Code: 24 18 49 8b 7a 30 48 8b 37 65 48 8b 56 08 65 48 03 35 3a 29 e2 7e 4c 3b 56 10 75 39 48 8b 0e 48 63 47 20 48 01 d8 48 39 cb 75 02 <0f> 0b 49 89 c0 4c 33 87 40 01 00 00 4c 31 c1 48 89 08 48 8d 4a ---[ end trace 4ac2e2c100842676 ]--- Signed-off-by: John Einar Reitan <john.einar(a)gmail.com> --- drivers/char/ipmi/ipmi_si_intf.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c index e1cbb78c6806..c04aa11f0e21 100644 --- a/drivers/char/ipmi/ipmi_si_intf.c +++ b/drivers/char/ipmi/ipmi_si_intf.c @@ -3469,7 +3469,6 @@ static int add_smi(struct smi_info *new_smi) ipmi_addr_src_to_str(new_smi->addr_source), si_to_str[new_smi->si_type]); rv = -EBUSY; - kfree(new_smi); goto out_err; } } -- 2.13.6

7 years, 8 months

2
1
0 0

[Linux-stable-mirror] Patch "powerpc/perf: Dereference BHRB entries safely" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled powerpc/perf: Dereference BHRB entries safely to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: powerpc-perf-dereference-bhrb-entries-safely.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From f41d84dddc66b164ac16acf3f584c276146f1c48 Mon Sep 17 00:00:00 2001 From: Ravi Bangoria <ravi.bangoria(a)linux.vnet.ibm.com> Date: Tue, 12 Dec 2017 17:59:15 +0530 Subject: powerpc/perf: Dereference BHRB entries safely From: Ravi Bangoria <ravi.bangoria(a)linux.vnet.ibm.com> commit f41d84dddc66b164ac16acf3f584c276146f1c48 upstream. It's theoretically possible that branch instructions recorded in BHRB (Branch History Rolling Buffer) entries have already been unmapped before they are processed by the kernel. Hence, trying to dereference such memory location will result in a crash. eg: Unable to handle kernel paging request for data at address 0xd000000019c41764 Faulting instruction address: 0xc000000000084a14 NIP [c000000000084a14] branch_target+0x4/0x70 LR [c0000000000eb828] record_and_restart+0x568/0x5c0 Call Trace: [c0000000000eb3b4] record_and_restart+0xf4/0x5c0 (unreliable) [c0000000000ec378] perf_event_interrupt+0x298/0x460 [c000000000027964] performance_monitor_exception+0x54/0x70 [c000000000009ba4] performance_monitor_common+0x114/0x120 Fix it by deferefencing the addresses safely. Fixes: 691231846ceb ("powerpc/perf: Fix setting of "to" addresses for BHRB") Suggested-by: Naveen N. Rao <naveen.n.rao(a)linux.vnet.ibm.com> Signed-off-by: Ravi Bangoria <ravi.bangoria(a)linux.vnet.ibm.com> Reviewed-by: Naveen N. Rao <naveen.n.rao(a)linux.vnet.ibm.com> [mpe: Use probe_kernel_read() which is clearer, tweak change log] Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/powerpc/perf/core-book3s.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) --- a/arch/powerpc/perf/core-book3s.c +++ b/arch/powerpc/perf/core-book3s.c @@ -410,8 +410,12 @@ static __u64 power_pmu_bhrb_to(u64 addr) int ret; __u64 target; - if (is_kernel_addr(addr)) - return branch_target((unsigned int *)addr); + if (is_kernel_addr(addr)) { + if (probe_kernel_read(&instr, (void *)addr, sizeof(instr))) + return 0; + + return branch_target(&instr); + } /* Userspace: need copy instruction here then translate it */ pagefault_disable(); Patches currently in stable-queue which might be from ravi.bangoria(a)linux.vnet.ibm.com are queue-4.14/powerpc-perf-dereference-bhrb-entries-safely.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "pinctrl: cherryview: Mask all interrupts on Intel_Strago based systems" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled pinctrl: cherryview: Mask all interrupts on Intel_Strago based systems to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: pinctrl-cherryview-mask-all-interrupts-on-intel_strago-based-systems.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From d2b3c353595a855794f8b9df5b5bdbe8deb0c413 Mon Sep 17 00:00:00 2001 From: Mika Westerberg <mika.westerberg(a)linux.intel.com> Date: Mon, 4 Dec 2017 12:11:02 +0300 Subject: pinctrl: cherryview: Mask all interrupts on Intel_Strago based systems From: Mika Westerberg <mika.westerberg(a)linux.intel.com> commit d2b3c353595a855794f8b9df5b5bdbe8deb0c413 upstream. Guenter Roeck reported an interrupt storm on a prototype system which is based on Cyan Chromebook. The root cause turned out to be a incorrectly configured pin that triggers spurious interrupts. This will be fixed in coreboot but currently we need to prevent the interrupt storm from happening by masking all interrupts (but not GPEs) on those systems. Link: https://bugzilla.kernel.org/show_bug.cgi?id=197953 Fixes: bcb48cca23ec ("pinctrl: cherryview: Do not mask all interrupts in probe") Reported-and-tested-by: Guenter Roeck <linux(a)roeck-us.net> Reported-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com> Signed-off-by: Mika Westerberg <mika.westerberg(a)linux.intel.com> Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/pinctrl/intel/pinctrl-cherryview.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) --- a/drivers/pinctrl/intel/pinctrl-cherryview.c +++ b/drivers/pinctrl/intel/pinctrl-cherryview.c @@ -1620,6 +1620,22 @@ static int chv_gpio_probe(struct chv_pin clear_bit(i, chip->irq_valid_mask); } + /* + * The same set of machines in chv_no_valid_mask[] have incorrectly + * configured GPIOs that generate spurious interrupts so we use + * this same list to apply another quirk for them. + * + * See also https://bugzilla.kernel.org/show_bug.cgi?id=197953. + */ + if (!need_valid_mask) { + /* + * Mask all interrupts the community is able to generate + * but leave the ones that can only generate GPEs unmasked. + */ + chv_writel(GENMASK(31, pctrl->community->nirqs), + pctrl->regs + CHV_INTMASK); + } + /* Clear all interrupts */ chv_writel(0xffff, pctrl->regs + CHV_INTSTAT); Patches currently in stable-queue which might be from mika.westerberg(a)linux.intel.com are queue-4.14/pinctrl-cherryview-mask-all-interrupts-on-intel_strago-based-systems.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "parisc: Hide Diva-built-in serial aux and graphics card" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled parisc: Hide Diva-built-in serial aux and graphics card to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: parisc-hide-diva-built-in-serial-aux-and-graphics-card.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From bcf3f1752a622f1372d3252d0fea8855d89812e7 Mon Sep 17 00:00:00 2001 From: Helge Deller <deller(a)gmx.de> Date: Tue, 12 Dec 2017 21:52:26 +0100 Subject: parisc: Hide Diva-built-in serial aux and graphics card From: Helge Deller <deller(a)gmx.de> commit bcf3f1752a622f1372d3252d0fea8855d89812e7 upstream. Diva GSP card has built-in serial AUX port and ATI graphic card which simply don't work and which both don't have external connectors. User Guides even mention that those devices shouldn't be used. So, prevent that Linux drivers try to enable those devices. Signed-off-by: Helge Deller <deller(a)gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/parisc/lba_pci.c | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) --- a/drivers/parisc/lba_pci.c +++ b/drivers/parisc/lba_pci.c @@ -1692,3 +1692,36 @@ void lba_set_iregs(struct parisc_device iounmap(base_addr); } + +/* + * The design of the Diva management card in rp34x0 machines (rp3410, rp3440) + * seems rushed, so that many built-in components simply don't work. + * The following quirks disable the serial AUX port and the built-in ATI RV100 + * Radeon 7000 graphics card which both don't have any external connectors and + * thus are useless, and even worse, e.g. the AUX port occupies ttyS0 and as + * such makes those machines the only PARISC machines on which we can't use + * ttyS0 as boot console. + */ +static void quirk_diva_ati_card(struct pci_dev *dev) +{ + if (dev->subsystem_vendor != PCI_VENDOR_ID_HP || + dev->subsystem_device != 0x1292) + return; + + dev_info(&dev->dev, "Hiding Diva built-in ATI card"); + dev->device = 0; +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_RADEON_QY, + quirk_diva_ati_card); + +static void quirk_diva_aux_disable(struct pci_dev *dev) +{ + if (dev->subsystem_vendor != PCI_VENDOR_ID_HP || + dev->subsystem_device != 0x1291) + return; + + dev_info(&dev->dev, "Hiding Diva built-in AUX serial device"); + dev->device = 0; +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_DIVA_AUX, + quirk_diva_aux_disable); Patches currently in stable-queue which might be from deller(a)gmx.de are queue-4.14/parisc-align-os_hpmc_size-on-word-boundary.patch queue-4.14/parisc-fix-indenting-in-puts.patch queue-4.14/revert-parisc-re-enable-interrupts-early.patch queue-4.14/parisc-hide-diva-built-in-serial-aux-and-graphics-card.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "PCI / PM: Force devices to D0 in pci_pm_thaw_noirq()" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled PCI / PM: Force devices to D0 in pci_pm_thaw_noirq() to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: pci-pm-force-devices-to-d0-in-pci_pm_thaw_noirq.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 5839ee7389e893a31e4e3c9cf17b50d14103c902 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" <rafael.j.wysocki(a)intel.com> Date: Fri, 15 Dec 2017 03:07:18 +0100 Subject: PCI / PM: Force devices to D0 in pci_pm_thaw_noirq() From: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com> commit 5839ee7389e893a31e4e3c9cf17b50d14103c902 upstream. It is incorrect to call pci_restore_state() for devices in low-power states (D1-D3), as that involves the restoration of MSI setup which requires MMIO to be operational and that is only the case in D0. However, pci_pm_thaw_noirq() may do that if the driver's "freeze" callbacks put the device into a low-power state, so fix it by making it force devices into D0 via pci_set_power_state() instead of trying to "update" their power state which is pointless. Fixes: e60514bd4485 (PCI/PM: Restore the status of PCI devices across hibernation) Reported-by: Thomas Gleixner <tglx(a)linutronix.de> Reported-by: Maarten Lankhorst <dev(a)mblankhorst.nl> Tested-by: Thomas Gleixner <tglx(a)linutronix.de> Tested-by: Maarten Lankhorst <dev(a)mblankhorst.nl> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com> Acked-by: Bjorn Helgaas <bhelgaas(a)google.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/pci/pci-driver.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -968,7 +968,12 @@ static int pci_pm_thaw_noirq(struct devi if (pci_has_legacy_pm_support(pci_dev)) return pci_legacy_resume_early(dev); - pci_update_current_state(pci_dev, PCI_D0); + /* + * pci_restore_state() requires the device to be in D0 (because of MSI + * restoration among other things), so force it into D0 in case the + * driver's "freeze" callbacks put it into a low-power state directly. + */ + pci_set_power_state(pci_dev, PCI_D0); pci_restore_state(pci_dev); if (drv && drv->pm && drv->pm->thaw_noirq) Patches currently in stable-queue which might be from rafael.j.wysocki(a)intel.com are queue-4.14/pci-pm-force-devices-to-d0-in-pci_pm_thaw_noirq.patch queue-4.14/acpi-apei-erst-fix-missing-error-handling-in-erst_reader.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "net: mvneta: use proper rxq_number in loop on rx queues" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled net: mvneta: use proper rxq_number in loop on rx queues to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: net-mvneta-use-proper-rxq_number-in-loop-on-rx-queues.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From ca5902a6547f662419689ca28b3c29a772446caa Mon Sep 17 00:00:00 2001 From: Yelena Krivosheev <yelena(a)marvell.com> Date: Tue, 19 Dec 2017 17:59:46 +0100 Subject: net: mvneta: use proper rxq_number in loop on rx queues From: Yelena Krivosheev <yelena(a)marvell.com> commit ca5902a6547f662419689ca28b3c29a772446caa upstream. When adding the RX queue association with each CPU, a typo was made in the mvneta_cleanup_rxqs() function. This patch fixes it. [gregory.clement(a)free-electrons.com: add commit log and fixes tag] Fixes: 2dcf75e2793c ("net: mvneta: Associate RX queues with each CPU") Signed-off-by: Yelena Krivosheev <yelena(a)marvell.com> Tested-by: Dmitri Epshtein <dima(a)marvell.com> Signed-off-by: Gregory CLEMENT <gregory.clement(a)free-electrons.com> Signed-off-by: David S. Miller <davem(a)davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/net/ethernet/marvell/mvneta.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/net/ethernet/marvell/mvneta.c +++ b/drivers/net/ethernet/marvell/mvneta.c @@ -3015,7 +3015,7 @@ static void mvneta_cleanup_rxqs(struct m { int queue; - for (queue = 0; queue < txq_number; queue++) + for (queue = 0; queue < rxq_number; queue++) mvneta_rxq_deinit(pp, &pp->rxqs[queue]); } Patches currently in stable-queue which might be from yelena(a)marvell.com are queue-4.14/net-mvneta-eliminate-wrong-call-to-handle-rx-descriptor-error.patch queue-4.14/net-mvneta-clear-interface-link-status-on-port-disable.patch queue-4.14/net-mvneta-use-proper-rxq_number-in-loop-on-rx-queues.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "parisc: Fix indenting in puts()" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled parisc: Fix indenting in puts() to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: parisc-fix-indenting-in-puts.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 203c110b39a89b48156c7450504e454fedb7f7f6 Mon Sep 17 00:00:00 2001 From: Helge Deller <deller(a)gmx.de> Date: Tue, 12 Dec 2017 21:32:16 +0100 Subject: parisc: Fix indenting in puts() From: Helge Deller <deller(a)gmx.de> commit 203c110b39a89b48156c7450504e454fedb7f7f6 upstream. Static analysis tools complain that we intended to have curly braces around this indent block. In this case this assumption is wrong, so fix the indenting. Fixes: 2f3c7b8137ef ("parisc: Add core code for self-extracting kernel") Reported-by: Dan Carpenter <dan.carpenter(a)oracle.com> Signed-off-by: Helge Deller <deller(a)gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/parisc/boot/compressed/misc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/arch/parisc/boot/compressed/misc.c +++ b/arch/parisc/boot/compressed/misc.c @@ -123,8 +123,8 @@ int puts(const char *s) while ((nuline = strchr(s, '\n')) != NULL) { if (nuline != s) pdc_iodc_print(s, nuline - s); - pdc_iodc_print("\r\n", 2); - s = nuline + 1; + pdc_iodc_print("\r\n", 2); + s = nuline + 1; } if (*s != '\0') pdc_iodc_print(s, strlen(s)); Patches currently in stable-queue which might be from deller(a)gmx.de are queue-4.14/parisc-align-os_hpmc_size-on-word-boundary.patch queue-4.14/parisc-fix-indenting-in-puts.patch queue-4.14/revert-parisc-re-enable-interrupts-early.patch queue-4.14/parisc-hide-diva-built-in-serial-aux-and-graphics-card.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "parisc: Align os_hpmc_size on word boundary" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled parisc: Align os_hpmc_size on word boundary to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: parisc-align-os_hpmc_size-on-word-boundary.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 0ed9d3de5f8f97e6efd5ca0e3377cab5f0451ead Mon Sep 17 00:00:00 2001 From: Helge Deller <deller(a)gmx.de> Date: Tue, 12 Dec 2017 21:25:41 +0100 Subject: parisc: Align os_hpmc_size on word boundary From: Helge Deller <deller(a)gmx.de> commit 0ed9d3de5f8f97e6efd5ca0e3377cab5f0451ead upstream. The os_hpmc_size variable sometimes wasn't aligned at word boundary and thus triggered the unaligned fault handler at startup. Fix it by aligning it properly. Signed-off-by: Helge Deller <deller(a)gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/parisc/kernel/hpmc.S | 1 + 1 file changed, 1 insertion(+) --- a/arch/parisc/kernel/hpmc.S +++ b/arch/parisc/kernel/hpmc.S @@ -305,6 +305,7 @@ ENDPROC_CFI(os_hpmc) __INITRODATA + .align 4 .export os_hpmc_size os_hpmc_size: .word .os_hpmc_end-.os_hpmc Patches currently in stable-queue which might be from deller(a)gmx.de are queue-4.14/parisc-align-os_hpmc_size-on-word-boundary.patch queue-4.14/parisc-fix-indenting-in-puts.patch queue-4.14/revert-parisc-re-enable-interrupts-early.patch queue-4.14/parisc-hide-diva-built-in-serial-aux-and-graphics-card.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "net: mvneta: clear interface link status on port disable" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled net: mvneta: clear interface link status on port disable to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: net-mvneta-clear-interface-link-status-on-port-disable.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 4423c18e466afdfb02a36ee8b9f901d144b3c607 Mon Sep 17 00:00:00 2001 From: Yelena Krivosheev <yelena(a)marvell.com> Date: Tue, 19 Dec 2017 17:59:45 +0100 Subject: net: mvneta: clear interface link status on port disable From: Yelena Krivosheev <yelena(a)marvell.com> commit 4423c18e466afdfb02a36ee8b9f901d144b3c607 upstream. When port connect to PHY in polling mode (with poll interval 1 sec), port and phy link status must be synchronize in order don't loss link change event. [gregory.clement(a)free-electrons.com: add fixes tag] Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit") Signed-off-by: Yelena Krivosheev <yelena(a)marvell.com> Tested-by: Dmitri Epshtein <dima(a)marvell.com> Signed-off-by: Gregory CLEMENT <gregory.clement(a)free-electrons.com> Signed-off-by: David S. Miller <davem(a)davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/net/ethernet/marvell/mvneta.c | 4 ++++ 1 file changed, 4 insertions(+) --- a/drivers/net/ethernet/marvell/mvneta.c +++ b/drivers/net/ethernet/marvell/mvneta.c @@ -1214,6 +1214,10 @@ static void mvneta_port_disable(struct m val &= ~MVNETA_GMAC0_PORT_ENABLE; mvreg_write(pp, MVNETA_GMAC_CTRL_0, val); + pp->link = 0; + pp->duplex = -1; + pp->speed = 0; + udelay(200); } Patches currently in stable-queue which might be from yelena(a)marvell.com are queue-4.14/net-mvneta-eliminate-wrong-call-to-handle-rx-descriptor-error.patch queue-4.14/net-mvneta-clear-interface-link-status-on-port-disable.patch queue-4.14/net-mvneta-use-proper-rxq_number-in-loop-on-rx-queues.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "net: mvneta: eliminate wrong call to handle rx descriptor error" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled net: mvneta: eliminate wrong call to handle rx descriptor error to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: net-mvneta-eliminate-wrong-call-to-handle-rx-descriptor-error.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 2eecb2e04abb62ef8ea7b43e1a46bdb5b99d1bf8 Mon Sep 17 00:00:00 2001 From: Yelena Krivosheev <yelena(a)marvell.com> Date: Tue, 19 Dec 2017 17:59:47 +0100 Subject: net: mvneta: eliminate wrong call to handle rx descriptor error From: Yelena Krivosheev <yelena(a)marvell.com> commit 2eecb2e04abb62ef8ea7b43e1a46bdb5b99d1bf8 upstream. There are few reasons in mvneta_rx_swbm() function when received packet is dropped. mvneta_rx_error() should be called only if error bit [16] is set in rx descriptor. [gregory.clement(a)free-electrons.com: add fixes tag] Fixes: dc35a10f68d3 ("net: mvneta: bm: add support for hardware buffer management") Signed-off-by: Yelena Krivosheev <yelena(a)marvell.com> Tested-by: Dmitri Epshtein <dima(a)marvell.com> Signed-off-by: Gregory CLEMENT <gregory.clement(a)free-electrons.com> Signed-off-by: David S. Miller <davem(a)davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/net/ethernet/marvell/mvneta.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/net/ethernet/marvell/mvneta.c +++ b/drivers/net/ethernet/marvell/mvneta.c @@ -1962,9 +1962,9 @@ static int mvneta_rx_swbm(struct mvneta_ if (!mvneta_rxq_desc_is_first_last(rx_status) || (rx_status & MVNETA_RXD_ERR_SUMMARY)) { + mvneta_rx_error(pp, rx_desc); err_drop_frame: dev->stats.rx_errors++; - mvneta_rx_error(pp, rx_desc); /* leave the descriptor untouched */ continue; } Patches currently in stable-queue which might be from yelena(a)marvell.com are queue-4.14/net-mvneta-eliminate-wrong-call-to-handle-rx-descriptor-error.patch queue-4.14/net-mvneta-clear-interface-link-status-on-port-disable.patch queue-4.14/net-mvneta-use-proper-rxq_number-in-loop-on-rx-queues.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "mfd: twl6040: Fix child-node lookup" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled mfd: twl6040: Fix child-node lookup to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: mfd-twl6040-fix-child-node-lookup.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 85e9b13cbb130a3209f21bd7933933399c389ffe Mon Sep 17 00:00:00 2001 From: Johan Hovold <johan(a)kernel.org> Date: Sat, 11 Nov 2017 16:38:44 +0100 Subject: mfd: twl6040: Fix child-node lookup From: Johan Hovold <johan(a)kernel.org> commit 85e9b13cbb130a3209f21bd7933933399c389ffe upstream. Fix child-node lookup during probe, which ended up searching the whole device tree depth-first starting at the parent rather than just matching on its children. To make things worse, the parent node was prematurely freed, while the child node was leaked. Note that the CONFIG_OF compile guard can be removed as of_get_child_by_name() provides a !CONFIG_OF implementation which always fails. Fixes: 37e13cecaa14 ("mfd: Add support for Device Tree to twl6040") Fixes: ca2cad6ae38e ("mfd: Fix twl6040 build failure") Signed-off-by: Johan Hovold <johan(a)kernel.org> Acked-by: Peter Ujfalusi <peter.ujfalusi(a)ti.com> Signed-off-by: Lee Jones <lee.jones(a)linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/mfd/twl6040.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) --- a/drivers/mfd/twl6040.c +++ b/drivers/mfd/twl6040.c @@ -97,12 +97,16 @@ static struct reg_sequence twl6040_patch }; -static bool twl6040_has_vibra(struct device_node *node) +static bool twl6040_has_vibra(struct device_node *parent) { -#ifdef CONFIG_OF - if (of_find_node_by_name(node, "vibra")) + struct device_node *node; + + node = of_get_child_by_name(parent, "vibra"); + if (node) { + of_node_put(node); return true; -#endif + } + return false; } Patches currently in stable-queue which might be from johan(a)kernel.org are queue-4.14/mfd-twl4030-audio-fix-sibling-node-lookup.patch queue-4.14/mfd-twl6040-fix-child-node-lookup.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "mfd: twl4030-audio: Fix sibling-node lookup" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled mfd: twl4030-audio: Fix sibling-node lookup to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: mfd-twl4030-audio-fix-sibling-node-lookup.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 0a423772de2f3d7b00899987884f62f63ae00dcb Mon Sep 17 00:00:00 2001 From: Johan Hovold <johan(a)kernel.org> Date: Sat, 11 Nov 2017 16:38:43 +0100 Subject: mfd: twl4030-audio: Fix sibling-node lookup From: Johan Hovold <johan(a)kernel.org> commit 0a423772de2f3d7b00899987884f62f63ae00dcb upstream. A helper purported to look up a child node based on its name was using the wrong of-helper and ended up prematurely freeing the parent of-node while leaking any matching node. To make things worse, any matching node would not even necessarily be a child node as the whole device tree was searched depth-first starting at the parent. Fixes: 019a7e6b7b31 ("mfd: twl4030-audio: Add DT support") Signed-off-by: Johan Hovold <johan(a)kernel.org> Acked-by: Peter Ujfalusi <peter.ujfalusi(a)ti.com> Signed-off-by: Lee Jones <lee.jones(a)linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/mfd/twl4030-audio.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) --- a/drivers/mfd/twl4030-audio.c +++ b/drivers/mfd/twl4030-audio.c @@ -159,13 +159,18 @@ unsigned int twl4030_audio_get_mclk(void EXPORT_SYMBOL_GPL(twl4030_audio_get_mclk); static bool twl4030_audio_has_codec(struct twl4030_audio_data *pdata, - struct device_node *node) + struct device_node *parent) { + struct device_node *node; + if (pdata && pdata->codec) return true; - if (of_find_node_by_name(node, "codec")) + node = of_get_child_by_name(parent, "codec"); + if (node) { + of_node_put(node); return true; + } return false; } Patches currently in stable-queue which might be from johan(a)kernel.org are queue-4.14/mfd-twl4030-audio-fix-sibling-node-lookup.patch queue-4.14/mfd-twl6040-fix-child-node-lookup.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "mfd: cros ec: spi: Don't send first message too soon" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled mfd: cros ec: spi: Don't send first message too soon to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: mfd-cros-ec-spi-don-t-send-first-message-too-soon.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 15d8374874ded0bec37ef27f8301a6d54032c0e5 Mon Sep 17 00:00:00 2001 From: Jon Hunter <jonathanh(a)nvidia.com> Date: Tue, 14 Nov 2017 14:43:27 +0000 Subject: mfd: cros ec: spi: Don't send first message too soon From: Jon Hunter <jonathanh(a)nvidia.com> commit 15d8374874ded0bec37ef27f8301a6d54032c0e5 upstream. On the Tegra124 Nyan-Big chromebook the very first SPI message sent to the EC is failing. The Tegra SPI driver configures the SPI chip-selects to be active-high by default (and always has for many years). The EC SPI requires an active-low chip-select and so the Tegra chip-select is reconfigured to be active-low when the EC SPI driver calls spi_setup(). The problem is that if the first SPI message to the EC is sent too soon after reconfiguring the SPI chip-select, it fails. The EC SPI driver prevents back-to-back SPI messages being sent too soon by keeping track of the time the last transfer was sent via the variable 'last_transfer_ns'. To prevent the very first transfer being sent too soon, initialise the 'last_transfer_ns' variable after calling spi_setup() and before sending the first SPI message. Signed-off-by: Jon Hunter <jonathanh(a)nvidia.com> Reviewed-by: Brian Norris <briannorris(a)chromium.org> Reviewed-by: Douglas Anderson <dianders(a)chromium.org> Acked-by: Benson Leung <bleung(a)chromium.org> Signed-off-by: Lee Jones <lee.jones(a)linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/mfd/cros_ec_spi.c | 1 + 1 file changed, 1 insertion(+) --- a/drivers/mfd/cros_ec_spi.c +++ b/drivers/mfd/cros_ec_spi.c @@ -667,6 +667,7 @@ static int cros_ec_spi_probe(struct spi_ sizeof(struct ec_response_get_protocol_info); ec_dev->dout_size = sizeof(struct ec_host_request); + ec_spi->last_transfer_ns = ktime_get_ns(); err = cros_ec_register(ec_dev); if (err) { Patches currently in stable-queue which might be from jonathanh(a)nvidia.com are queue-4.14/mfd-cros-ec-spi-don-t-send-first-message-too-soon.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "libnvdimm, pfn: fix start_pad handling for aligned namespaces" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled libnvdimm, pfn: fix start_pad handling for aligned namespaces to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: libnvdimm-pfn-fix-start_pad-handling-for-aligned-namespaces.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 19deaa217bc04e83b59b5e8c8229eb0e53ad9efc Mon Sep 17 00:00:00 2001 From: Dan Williams <dan.j.williams(a)intel.com> Date: Tue, 19 Dec 2017 15:07:10 -0800 Subject: libnvdimm, pfn: fix start_pad handling for aligned namespaces From: Dan Williams <dan.j.williams(a)intel.com> commit 19deaa217bc04e83b59b5e8c8229eb0e53ad9efc upstream. The alignment checks at pfn driver startup fail to properly account for the 'start_pad' in the case where the namespace is misaligned relative to its internal alignment. This is typically triggered in 1G aligned namespace, but could theoretically trigger with small namespace alignments. When this triggers the kernel reports messages of the form: dax2.1: bad offset: 0x3c000000 dax disabled align: 0x40000000 Fixes: 1ee6667cd8d1 ("libnvdimm, pfn, dax: fix initialization vs autodetect...") Reported-by: Jane Chu <jane.chu(a)oracle.com> Signed-off-by: Dan Williams <dan.j.williams(a)intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/nvdimm/pfn_devs.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) --- a/drivers/nvdimm/pfn_devs.c +++ b/drivers/nvdimm/pfn_devs.c @@ -364,9 +364,9 @@ struct device *nd_pfn_create(struct nd_r int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig) { u64 checksum, offset; - unsigned long align; enum nd_pfn_mode mode; struct nd_namespace_io *nsio; + unsigned long align, start_pad; struct nd_pfn_sb *pfn_sb = nd_pfn->pfn_sb; struct nd_namespace_common *ndns = nd_pfn->ndns; const u8 *parent_uuid = nd_dev_to_uuid(&ndns->dev); @@ -410,6 +410,7 @@ int nd_pfn_validate(struct nd_pfn *nd_pf align = le32_to_cpu(pfn_sb->align); offset = le64_to_cpu(pfn_sb->dataoff); + start_pad = le32_to_cpu(pfn_sb->start_pad); if (align == 0) align = 1UL << ilog2(offset); mode = le32_to_cpu(pfn_sb->mode); @@ -468,7 +469,7 @@ int nd_pfn_validate(struct nd_pfn *nd_pf return -EBUSY; } - if ((align && !IS_ALIGNED(offset, align)) + if ((align && !IS_ALIGNED(nsio->res.start + offset + start_pad, align)) || !IS_ALIGNED(offset, PAGE_SIZE)) { dev_err(&nd_pfn->dev, "bad offset: %#llx dax disabled align: %#lx\n", Patches currently in stable-queue which might be from dan.j.williams(a)intel.com are queue-4.14/libnvdimm-btt-fix-an-incompatibility-in-the-log-layout.patch queue-4.14/libnvdimm-pfn-fix-start_pad-handling-for-aligned-namespaces.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/libnvdimm-dax-fix-1gb-aligned-namespaces-vs-physical-misalignment.patch queue-4.14/acpi-nfit-fix-health-event-notification.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "libnvdimm, dax: fix 1GB-aligned namespaces vs physical misalignment" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled libnvdimm, dax: fix 1GB-aligned namespaces vs physical misalignment to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: libnvdimm-dax-fix-1gb-aligned-namespaces-vs-physical-misalignment.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 41fce90f26333c4fa82e8e43b9ace86c4e8a0120 Mon Sep 17 00:00:00 2001 From: Dan Williams <dan.j.williams(a)intel.com> Date: Mon, 4 Dec 2017 14:07:43 -0800 Subject: libnvdimm, dax: fix 1GB-aligned namespaces vs physical misalignment From: Dan Williams <dan.j.williams(a)intel.com> commit 41fce90f26333c4fa82e8e43b9ace86c4e8a0120 upstream. The following namespace configuration attempt: # ndctl create-namespace -e namespace0.0 -m devdax -a 1G -f libndctl: ndctl_dax_enable: dax0.1: failed to enable Error: namespace0.0: failed to enable failed to reconfigure namespace: No such device or address ...fails when the backing memory range is not physically aligned to 1G: # cat /proc/iomem | grep Persistent 210000000-30fffffff : Persistent Memory (legacy) In the above example the 4G persistent memory range starts and ends on a 256MB boundary. We handle this case correctly when needing to handle cases that violate section alignment (128MB) collisions against "System RAM", and we simply need to extend that padding/truncation for the 1GB alignment use case. Fixes: 315c562536c4 ("libnvdimm, pfn: add 'align' attribute...") Reported-and-tested-by: Jane Chu <jane.chu(a)oracle.com> Signed-off-by: Dan Williams <dan.j.williams(a)intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/nvdimm/pfn_devs.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) --- a/drivers/nvdimm/pfn_devs.c +++ b/drivers/nvdimm/pfn_devs.c @@ -582,6 +582,12 @@ static struct vmem_altmap *__nvdimm_setu return altmap; } +static u64 phys_pmem_align_down(struct nd_pfn *nd_pfn, u64 phys) +{ + return min_t(u64, PHYS_SECTION_ALIGN_DOWN(phys), + ALIGN_DOWN(phys, nd_pfn->align)); +} + static int nd_pfn_init(struct nd_pfn *nd_pfn) { u32 dax_label_reserve = is_nd_dax(&nd_pfn->dev) ? SZ_128K : 0; @@ -637,13 +643,16 @@ static int nd_pfn_init(struct nd_pfn *nd start = nsio->res.start; size = PHYS_SECTION_ALIGN_UP(start + size) - start; if (region_intersects(start, size, IORESOURCE_SYSTEM_RAM, - IORES_DESC_NONE) == REGION_MIXED) { + IORES_DESC_NONE) == REGION_MIXED + || !IS_ALIGNED(start + resource_size(&nsio->res), + nd_pfn->align)) { size = resource_size(&nsio->res); - end_trunc = start + size - PHYS_SECTION_ALIGN_DOWN(start + size); + end_trunc = start + size - phys_pmem_align_down(nd_pfn, + start + size); } if (start_pad + end_trunc) - dev_info(&nd_pfn->dev, "%s section collision, truncate %d bytes\n", + dev_info(&nd_pfn->dev, "%s alignment collision, truncate %d bytes\n", dev_name(&ndns->dev), start_pad + end_trunc); /* Patches currently in stable-queue which might be from dan.j.williams(a)intel.com are queue-4.14/libnvdimm-btt-fix-an-incompatibility-in-the-log-layout.patch queue-4.14/libnvdimm-pfn-fix-start_pad-handling-for-aligned-namespaces.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/libnvdimm-dax-fix-1gb-aligned-namespaces-vs-physical-misalignment.patch queue-4.14/acpi-nfit-fix-health-event-notification.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "libnvdimm, btt: Fix an incompatibility in the log layout" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled libnvdimm, btt: Fix an incompatibility in the log layout to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: libnvdimm-btt-fix-an-incompatibility-in-the-log-layout.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 24e3a7fb60a9187e5df90e5fa655ffc94b9c4f77 Mon Sep 17 00:00:00 2001 From: Vishal Verma <vishal.l.verma(a)intel.com> Date: Mon, 18 Dec 2017 09:28:39 -0700 Subject: libnvdimm, btt: Fix an incompatibility in the log layout From: Vishal Verma <vishal.l.verma(a)intel.com> commit 24e3a7fb60a9187e5df90e5fa655ffc94b9c4f77 upstream. Due to a spec misinterpretation, the Linux implementation of the BTT log area had different padding scheme from other implementations, such as UEFI and NVML. This fixes the padding scheme, and defaults to it for new BTT layouts. We attempt to detect the padding scheme in use when probing for an existing BTT. If we detect the older/incompatible scheme, we continue using it. Reported-by: Juston Li <juston.li(a)intel.com> Cc: Dan Williams <dan.j.williams(a)intel.com> Fixes: 5212e11fde4d ("nd_btt: atomic sector updates") Signed-off-by: Vishal Verma <vishal.l.verma(a)intel.com> Signed-off-by: Dan Williams <dan.j.williams(a)intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/nvdimm/btt.c | 201 ++++++++++++++++++++++++++++++++++++++++++--------- drivers/nvdimm/btt.h | 45 +++++++++++ 2 files changed, 211 insertions(+), 35 deletions(-) --- a/drivers/nvdimm/btt.c +++ b/drivers/nvdimm/btt.c @@ -210,12 +210,12 @@ static int btt_map_read(struct arena_inf return ret; } -static int btt_log_read_pair(struct arena_info *arena, u32 lane, - struct log_entry *ent) +static int btt_log_group_read(struct arena_info *arena, u32 lane, + struct log_group *log) { return arena_read_bytes(arena, - arena->logoff + (2 * lane * LOG_ENT_SIZE), ent, - 2 * LOG_ENT_SIZE, 0); + arena->logoff + (lane * LOG_GRP_SIZE), log, + LOG_GRP_SIZE, 0); } static struct dentry *debugfs_root; @@ -255,6 +255,8 @@ static void arena_debugfs_init(struct ar debugfs_create_x64("logoff", S_IRUGO, d, &a->logoff); debugfs_create_x64("info2off", S_IRUGO, d, &a->info2off); debugfs_create_x32("flags", S_IRUGO, d, &a->flags); + debugfs_create_u32("log_index_0", S_IRUGO, d, &a->log_index[0]); + debugfs_create_u32("log_index_1", S_IRUGO, d, &a->log_index[1]); } static void btt_debugfs_init(struct btt *btt) @@ -273,6 +275,11 @@ static void btt_debugfs_init(struct btt } } +static u32 log_seq(struct log_group *log, int log_idx) +{ + return le32_to_cpu(log->ent[log_idx].seq); +} + /* * This function accepts two log entries, and uses the * sequence number to find the 'older' entry. @@ -282,8 +289,10 @@ static void btt_debugfs_init(struct btt * * TODO The logic feels a bit kludge-y. make it better.. */ -static int btt_log_get_old(struct log_entry *ent) +static int btt_log_get_old(struct arena_info *a, struct log_group *log) { + int idx0 = a->log_index[0]; + int idx1 = a->log_index[1]; int old; /* @@ -291,23 +300,23 @@ static int btt_log_get_old(struct log_en * the next time, the following logic works out to put this * (next) entry into [1] */ - if (ent[0].seq == 0) { - ent[0].seq = cpu_to_le32(1); + if (log_seq(log, idx0) == 0) { + log->ent[idx0].seq = cpu_to_le32(1); return 0; } - if (ent[0].seq == ent[1].seq) + if (log_seq(log, idx0) == log_seq(log, idx1)) return -EINVAL; - if (le32_to_cpu(ent[0].seq) + le32_to_cpu(ent[1].seq) > 5) + if (log_seq(log, idx0) + log_seq(log, idx1) > 5) return -EINVAL; - if (le32_to_cpu(ent[0].seq) < le32_to_cpu(ent[1].seq)) { - if (le32_to_cpu(ent[1].seq) - le32_to_cpu(ent[0].seq) == 1) + if (log_seq(log, idx0) < log_seq(log, idx1)) { + if ((log_seq(log, idx1) - log_seq(log, idx0)) == 1) old = 0; else old = 1; } else { - if (le32_to_cpu(ent[0].seq) - le32_to_cpu(ent[1].seq) == 1) + if ((log_seq(log, idx0) - log_seq(log, idx1)) == 1) old = 1; else old = 0; @@ -327,17 +336,18 @@ static int btt_log_read(struct arena_inf { int ret; int old_ent, ret_ent; - struct log_entry log[2]; + struct log_group log; - ret = btt_log_read_pair(arena, lane, log); + ret = btt_log_group_read(arena, lane, &log); if (ret) return -EIO; - old_ent = btt_log_get_old(log); + old_ent = btt_log_get_old(arena, &log); if (old_ent < 0 || old_ent > 1) { dev_err(to_dev(arena), "log corruption (%d): lane %d seq [%d, %d]\n", - old_ent, lane, log[0].seq, log[1].seq); + old_ent, lane, log.ent[arena->log_index[0]].seq, + log.ent[arena->log_index[1]].seq); /* TODO set error state? */ return -EIO; } @@ -345,7 +355,7 @@ static int btt_log_read(struct arena_inf ret_ent = (old_flag ? old_ent : (1 - old_ent)); if (ent != NULL) - memcpy(ent, &log[ret_ent], LOG_ENT_SIZE); + memcpy(ent, &log.ent[arena->log_index[ret_ent]], LOG_ENT_SIZE); return ret_ent; } @@ -359,17 +369,13 @@ static int __btt_log_write(struct arena_ u32 sub, struct log_entry *ent, unsigned long flags) { int ret; - /* - * Ignore the padding in log_entry for calculating log_half. - * The entry is 'committed' when we write the sequence number, - * and we want to ensure that that is the last thing written. - * We don't bother writing the padding as that would be extra - * media wear and write amplification - */ - unsigned int log_half = (LOG_ENT_SIZE - 2 * sizeof(u64)) / 2; - u64 ns_off = arena->logoff + (((2 * lane) + sub) * LOG_ENT_SIZE); + u32 group_slot = arena->log_index[sub]; + unsigned int log_half = LOG_ENT_SIZE / 2; void *src = ent; + u64 ns_off; + ns_off = arena->logoff + (lane * LOG_GRP_SIZE) + + (group_slot * LOG_ENT_SIZE); /* split the 16B write into atomic, durable halves */ ret = arena_write_bytes(arena, ns_off, src, log_half, flags); if (ret) @@ -452,7 +458,7 @@ static int btt_log_init(struct arena_inf { size_t logsize = arena->info2off - arena->logoff; size_t chunk_size = SZ_4K, offset = 0; - struct log_entry log; + struct log_entry ent; void *zerobuf; int ret; u32 i; @@ -484,11 +490,11 @@ static int btt_log_init(struct arena_inf } for (i = 0; i < arena->nfree; i++) { - log.lba = cpu_to_le32(i); - log.old_map = cpu_to_le32(arena->external_nlba + i); - log.new_map = cpu_to_le32(arena->external_nlba + i); - log.seq = cpu_to_le32(LOG_SEQ_INIT); - ret = __btt_log_write(arena, i, 0, &log, 0); + ent.lba = cpu_to_le32(i); + ent.old_map = cpu_to_le32(arena->external_nlba + i); + ent.new_map = cpu_to_le32(arena->external_nlba + i); + ent.seq = cpu_to_le32(LOG_SEQ_INIT); + ret = __btt_log_write(arena, i, 0, &ent, 0); if (ret) goto free; } @@ -593,6 +599,123 @@ static int btt_freelist_init(struct aren return 0; } +static bool ent_is_padding(struct log_entry *ent) +{ + return (ent->lba == 0) && (ent->old_map == 0) && (ent->new_map == 0) + && (ent->seq == 0); +} + +/* + * Detecting valid log indices: We read a log group (see the comments in btt.h + * for a description of a 'log_group' and its 'slots'), and iterate over its + * four slots. We expect that a padding slot will be all-zeroes, and use this + * to detect a padding slot vs. an actual entry. + * + * If a log_group is in the initial state, i.e. hasn't been used since the + * creation of this BTT layout, it will have three of the four slots with + * zeroes. We skip over these log_groups for the detection of log_index. If + * all log_groups are in the initial state (i.e. the BTT has never been + * written to), it is safe to assume the 'new format' of log entries in slots + * (0, 1). + */ +static int log_set_indices(struct arena_info *arena) +{ + bool idx_set = false, initial_state = true; + int ret, log_index[2] = {-1, -1}; + u32 i, j, next_idx = 0; + struct log_group log; + u32 pad_count = 0; + + for (i = 0; i < arena->nfree; i++) { + ret = btt_log_group_read(arena, i, &log); + if (ret < 0) + return ret; + + for (j = 0; j < 4; j++) { + if (!idx_set) { + if (ent_is_padding(&log.ent[j])) { + pad_count++; + continue; + } else { + /* Skip if index has been recorded */ + if ((next_idx == 1) && + (j == log_index[0])) + continue; + /* valid entry, record index */ + log_index[next_idx] = j; + next_idx++; + } + if (next_idx == 2) { + /* two valid entries found */ + idx_set = true; + } else if (next_idx > 2) { + /* too many valid indices */ + return -ENXIO; + } + } else { + /* + * once the indices have been set, just verify + * that all subsequent log groups are either in + * their initial state or follow the same + * indices. + */ + if (j == log_index[0]) { + /* entry must be 'valid' */ + if (ent_is_padding(&log.ent[j])) + return -ENXIO; + } else if (j == log_index[1]) { + ; + /* + * log_index[1] can be padding if the + * lane never got used and it is still + * in the initial state (three 'padding' + * entries) + */ + } else { + /* entry must be invalid (padding) */ + if (!ent_is_padding(&log.ent[j])) + return -ENXIO; + } + } + } + /* + * If any of the log_groups have more than one valid, + * non-padding entry, then the we are no longer in the + * initial_state + */ + if (pad_count < 3) + initial_state = false; + pad_count = 0; + } + + if (!initial_state && !idx_set) + return -ENXIO; + + /* + * If all the entries in the log were in the initial state, + * assume new padding scheme + */ + if (initial_state) + log_index[1] = 1; + + /* + * Only allow the known permutations of log/padding indices, + * i.e. (0, 1), and (0, 2) + */ + if ((log_index[0] == 0) && ((log_index[1] == 1) || (log_index[1] == 2))) + ; /* known index possibilities */ + else { + dev_err(to_dev(arena), "Found an unknown padding scheme\n"); + return -ENXIO; + } + + arena->log_index[0] = log_index[0]; + arena->log_index[1] = log_index[1]; + dev_dbg(to_dev(arena), "log_index_0 = %d\n", log_index[0]); + dev_dbg(to_dev(arena), "log_index_1 = %d\n", log_index[1]); + return 0; +} + static int btt_rtt_init(struct arena_info *arena) { arena->rtt = kcalloc(arena->nfree, sizeof(u32), GFP_KERNEL); @@ -649,8 +772,7 @@ static struct arena_info *alloc_arena(st available -= 2 * BTT_PG_SIZE; /* The log takes a fixed amount of space based on nfree */ - logsize = roundup(2 * arena->nfree * sizeof(struct log_entry), - BTT_PG_SIZE); + logsize = roundup(arena->nfree * LOG_GRP_SIZE, BTT_PG_SIZE); available -= logsize; /* Calculate optimal split between map and data area */ @@ -667,6 +789,10 @@ static struct arena_info *alloc_arena(st arena->mapoff = arena->dataoff + datasize; arena->logoff = arena->mapoff + mapsize; arena->info2off = arena->logoff + logsize; + + /* Default log indices are (0,1) */ + arena->log_index[0] = 0; + arena->log_index[1] = 1; return arena; } @@ -757,6 +883,13 @@ static int discover_arenas(struct btt *b arena->external_lba_start = cur_nlba; parse_arena_meta(arena, super, cur_off); + ret = log_set_indices(arena); + if (ret) { + dev_err(to_dev(arena), + "Unable to deduce log/padding indices\n"); + goto out; + } + mutex_init(&arena->err_lock); ret = btt_freelist_init(arena); if (ret) --- a/drivers/nvdimm/btt.h +++ b/drivers/nvdimm/btt.h @@ -27,6 +27,7 @@ #define MAP_ERR_MASK (1 << MAP_ERR_SHIFT) #define MAP_LBA_MASK (~((1 << MAP_TRIM_SHIFT) | (1 << MAP_ERR_SHIFT))) #define MAP_ENT_NORMAL 0xC0000000 +#define LOG_GRP_SIZE sizeof(struct log_group) #define LOG_ENT_SIZE sizeof(struct log_entry) #define ARENA_MIN_SIZE (1UL << 24) /* 16 MB */ #define ARENA_MAX_SIZE (1ULL << 39) /* 512 GB */ @@ -50,12 +51,52 @@ enum btt_init_state { INIT_READY }; +/* + * A log group represents one log 'lane', and consists of four log entries. + * Two of the four entries are valid entries, and the remaining two are + * padding. Due to an old bug in the padding location, we need to perform a + * test to determine the padding scheme being used, and use that scheme + * thereafter. + * + * In kernels prior to 4.15, 'log group' would have actual log entries at + * indices (0, 2) and padding at indices (1, 3), where as the correct/updated + * format has log entries at indices (0, 1) and padding at indices (2, 3). + * + * Old (pre 4.15) format: + * +-----------------+-----------------+ + * | ent[0] | ent[1] | + * | 16B | 16B | + * | lba/old/new/seq | pad | + * +-----------------------------------+ + * | ent[2] | ent[3] | + * | 16B | 16B | + * | lba/old/new/seq | pad | + * +-----------------+-----------------+ + * + * New format: + * +-----------------+-----------------+ + * | ent[0] | ent[1] | + * | 16B | 16B | + * | lba/old/new/seq | lba/old/new/seq | + * +-----------------------------------+ + * | ent[2] | ent[3] | + * | 16B | 16B | + * | pad | pad | + * +-----------------+-----------------+ + * + * We detect during start-up which format is in use, and set + * arena->log_index[(0, 1)] with the detected format. + */ + struct log_entry { __le32 lba; __le32 old_map; __le32 new_map; __le32 seq; - __le64 padding[2]; +}; + +struct log_group { + struct log_entry ent[4]; }; struct btt_sb { @@ -125,6 +166,7 @@ struct aligned_lock { * @list: List head for list of arenas * @debugfs_dir: Debugfs dentry * @flags: Arena flags - may signify error states. + * @log_index: Indices of the valid log entries in a log_group * * arena_info is a per-arena handle. Once an arena is narrowed down for an * IO, this struct is passed around for the duration of the IO. @@ -157,6 +199,7 @@ struct arena_info { /* Arena flags */ u32 flags; struct mutex err_lock; + int log_index[2]; }; /** Patches currently in stable-queue which might be from vishal.l.verma(a)intel.com are queue-4.14/libnvdimm-btt-fix-an-incompatibility-in-the-log-layout.patch queue-4.14/acpi-nfit-fix-health-event-notification.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "kvm: x86: fix RSM when PCID is non-zero" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled kvm: x86: fix RSM when PCID is non-zero to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: kvm-x86-fix-rsm-when-pcid-is-non-zero.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From fae1a3e775cca8c3a9e0eb34443b310871a15a92 Mon Sep 17 00:00:00 2001 From: Paolo Bonzini <pbonzini(a)redhat.com> Date: Thu, 21 Dec 2017 00:49:14 +0100 Subject: kvm: x86: fix RSM when PCID is non-zero From: Paolo Bonzini <pbonzini(a)redhat.com> commit fae1a3e775cca8c3a9e0eb34443b310871a15a92 upstream. rsm_load_state_64() and rsm_enter_protected_mode() load CR3, then CR4 & ~PCIDE, then CR0, then CR4. However, setting CR4.PCIDE fails if CR3[11:0] != 0. It's probably easier in the long run to replace rsm_enter_protected_mode() with an emulator callback that sets all the special registers (like KVM_SET_SREGS would do). For now, set the PCID field of CR3 only after CR4.PCIDE is 1. Reported-by: Laszlo Ersek <lersek(a)redhat.com> Tested-by: Laszlo Ersek <lersek(a)redhat.com> Fixes: 660a5d517aaab9187f93854425c4c63f4a09195c Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/kvm/emulate.c | 32 +++++++++++++++++++++++++------- 1 file changed, 25 insertions(+), 7 deletions(-) --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -2404,9 +2404,21 @@ static int rsm_load_seg_64(struct x86_em } static int rsm_enter_protected_mode(struct x86_emulate_ctxt *ctxt, - u64 cr0, u64 cr4) + u64 cr0, u64 cr3, u64 cr4) { int bad; + u64 pcid; + + /* In order to later set CR4.PCIDE, CR3[11:0] must be zero. */ + pcid = 0; + if (cr4 & X86_CR4_PCIDE) { + pcid = cr3 & 0xfff; + cr3 &= ~0xfff; + } + + bad = ctxt->ops->set_cr(ctxt, 3, cr3); + if (bad) + return X86EMUL_UNHANDLEABLE; /* * First enable PAE, long mode needs it before CR0.PG = 1 is set. @@ -2425,6 +2437,12 @@ static int rsm_enter_protected_mode(stru bad = ctxt->ops->set_cr(ctxt, 4, cr4); if (bad) return X86EMUL_UNHANDLEABLE; + if (pcid) { + bad = ctxt->ops->set_cr(ctxt, 3, cr3 | pcid); + if (bad) + return X86EMUL_UNHANDLEABLE; + } + } return X86EMUL_CONTINUE; @@ -2435,11 +2453,11 @@ static int rsm_load_state_32(struct x86_ struct desc_struct desc; struct desc_ptr dt; u16 selector; - u32 val, cr0, cr4; + u32 val, cr0, cr3, cr4; int i; cr0 = GET_SMSTATE(u32, smbase, 0x7ffc); - ctxt->ops->set_cr(ctxt, 3, GET_SMSTATE(u32, smbase, 0x7ff8)); + cr3 = GET_SMSTATE(u32, smbase, 0x7ff8); ctxt->eflags = GET_SMSTATE(u32, smbase, 0x7ff4) | X86_EFLAGS_FIXED; ctxt->_eip = GET_SMSTATE(u32, smbase, 0x7ff0); @@ -2481,14 +2499,14 @@ static int rsm_load_state_32(struct x86_ ctxt->ops->set_smbase(ctxt, GET_SMSTATE(u32, smbase, 0x7ef8)); - return rsm_enter_protected_mode(ctxt, cr0, cr4); + return rsm_enter_protected_mode(ctxt, cr0, cr3, cr4); } static int rsm_load_state_64(struct x86_emulate_ctxt *ctxt, u64 smbase) { struct desc_struct desc; struct desc_ptr dt; - u64 val, cr0, cr4; + u64 val, cr0, cr3, cr4; u32 base3; u16 selector; int i, r; @@ -2505,7 +2523,7 @@ static int rsm_load_state_64(struct x86_ ctxt->ops->set_dr(ctxt, 7, (val & DR7_VOLATILE) | DR7_FIXED_1); cr0 = GET_SMSTATE(u64, smbase, 0x7f58); - ctxt->ops->set_cr(ctxt, 3, GET_SMSTATE(u64, smbase, 0x7f50)); + cr3 = GET_SMSTATE(u64, smbase, 0x7f50); cr4 = GET_SMSTATE(u64, smbase, 0x7f48); ctxt->ops->set_smbase(ctxt, GET_SMSTATE(u32, smbase, 0x7f00)); val = GET_SMSTATE(u64, smbase, 0x7ed0); @@ -2533,7 +2551,7 @@ static int rsm_load_state_64(struct x86_ dt.address = GET_SMSTATE(u64, smbase, 0x7e68); ctxt->ops->set_gdt(ctxt, &dt); - r = rsm_enter_protected_mode(ctxt, cr0, cr4); + r = rsm_enter_protected_mode(ctxt, cr0, cr3, cr4); if (r != X86EMUL_CONTINUE) return r; Patches currently in stable-queue which might be from pbonzini(a)redhat.com are queue-4.14/kvm-mmu-fix-infinite-loop-when-there-is-no-available-mmu-page.patch queue-4.14/kvm-x86-fix-load-rflags-w-o-the-fixed-bit.patch queue-4.14/kvm-x86-fix-rsm-when-pcid-is-non-zero.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "KVM: X86: Fix load RFLAGS w/o the fixed bit" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled KVM: X86: Fix load RFLAGS w/o the fixed bit to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: kvm-x86-fix-load-rflags-w-o-the-fixed-bit.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From d73235d17ba63b53dc0e1051dbc10a1f1be91b71 Mon Sep 17 00:00:00 2001 From: Wanpeng Li <wanpeng.li(a)hotmail.com> Date: Thu, 7 Dec 2017 00:30:08 -0800 Subject: KVM: X86: Fix load RFLAGS w/o the fixed bit MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit From: Wanpeng Li <wanpeng.li(a)hotmail.com> commit d73235d17ba63b53dc0e1051dbc10a1f1be91b71 upstream. *** Guest State *** CR0: actual=0x0000000000000030, shadow=0x0000000060000010, gh_mask=fffffffffffffff7 CR4: actual=0x0000000000002050, shadow=0x0000000000000000, gh_mask=ffffffffffffe871 CR3 = 0x00000000fffbc000 RSP = 0x0000000000000000 RIP = 0x0000000000000000 RFLAGS=0x00000000 DR7 = 0x0000000000000400 ^^^^^^^^^^ The failed vmentry is triggered by the following testcase when ept=Y: #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <linux/kvm.h> #include <fcntl.h> #include <sys/ioctl.h> long r[5]; int main() { r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7); struct kvm_regs regs = { .rflags = 0, }; ioctl(r[4], KVM_SET_REGS, &regs); ioctl(r[4], KVM_RUN, 0); } X86 RFLAGS bit 1 is fixed set, userspace can simply clearing bit 1 of RFLAGS with KVM_SET_REGS ioctl which results in vmentry fails. This patch fixes it by oring X86_EFLAGS_FIXED during ioctl. Suggested-by: Jim Mattson <jmattson(a)google.com> Reviewed-by: David Hildenbrand <david(a)redhat.com> Reviewed-by: Quan Xu <quan.xu0(a)gmail.com> Cc: Paolo Bonzini <pbonzini(a)redhat.com> Cc: Radim Krčmář <rkrcmar(a)redhat.com> Cc: Jim Mattson <jmattson(a)google.com> Signed-off-by: Wanpeng Li <wanpeng.li(a)hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/kvm/x86.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7359,7 +7359,7 @@ int kvm_arch_vcpu_ioctl_set_regs(struct #endif kvm_rip_write(vcpu, regs->rip); - kvm_set_rflags(vcpu, regs->rflags); + kvm_set_rflags(vcpu, regs->rflags | X86_EFLAGS_FIXED); vcpu->arch.exception.pending = false; Patches currently in stable-queue which might be from wanpeng.li(a)hotmail.com are queue-4.14/kvm-mmu-fix-infinite-loop-when-there-is-no-available-mmu-page.patch queue-4.14/kvm-x86-fix-load-rflags-w-o-the-fixed-bit.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "KVM: PPC: Book3S HV: Fix pending_pri value in kvmppc_xive_get_icp()" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled KVM: PPC: Book3S HV: Fix pending_pri value in kvmppc_xive_get_icp() to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: kvm-ppc-book3s-hv-fix-pending_pri-value-in-kvmppc_xive_get_icp.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 7333b5aca412d6ad02667b5a513485838a91b136 Mon Sep 17 00:00:00 2001 From: Laurent Vivier <lvivier(a)redhat.com> Date: Tue, 12 Dec 2017 18:23:56 +0100 Subject: KVM: PPC: Book3S HV: Fix pending_pri value in kvmppc_xive_get_icp() From: Laurent Vivier <lvivier(a)redhat.com> commit 7333b5aca412d6ad02667b5a513485838a91b136 upstream. When we migrate a VM from a POWER8 host (XICS) to a POWER9 host (XICS-on-XIVE), we have an error: qemu-kvm: Unable to restore KVM interrupt controller state \ (0xff000000) for CPU 0: Invalid argument This is because kvmppc_xics_set_icp() checks the new state is internaly consistent, and especially: ... 1129 if (xisr == 0) { 1130 if (pending_pri != 0xff) 1131 return -EINVAL; ... On the other side, kvmppc_xive_get_icp() doesn't set neither the pending_pri value, nor the xisr value (set to 0) (and kvmppc_xive_set_icp() ignores the pending_pri value) As xisr is 0, pending_pri must be set to 0xff. Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller") Signed-off-by: Laurent Vivier <lvivier(a)redhat.com> Acked-by: Benjamin Herrenschmidt <benh(a)kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/powerpc/kvm/book3s_xive.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) --- a/arch/powerpc/kvm/book3s_xive.c +++ b/arch/powerpc/kvm/book3s_xive.c @@ -725,7 +725,8 @@ u64 kvmppc_xive_get_icp(struct kvm_vcpu /* Return the per-cpu state for state saving/migration */ return (u64)xc->cppr << KVM_REG_PPC_ICP_CPPR_SHIFT | - (u64)xc->mfrr << KVM_REG_PPC_ICP_MFRR_SHIFT; + (u64)xc->mfrr << KVM_REG_PPC_ICP_MFRR_SHIFT | + (u64)0xff << KVM_REG_PPC_ICP_PPRI_SHIFT; } int kvmppc_xive_set_icp(struct kvm_vcpu *vcpu, u64 icpval) Patches currently in stable-queue which might be from lvivier(a)redhat.com are queue-4.14/kvm-ppc-book3s-hv-fix-pending_pri-value-in-kvmppc_xive_get_icp.patch queue-4.14/kvm-ppc-book3s-fix-xive-migration-of-pending-interrupts.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "KVM: MMU: Fix infinite loop when there is no available mmu page" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled KVM: MMU: Fix infinite loop when there is no available mmu page to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: kvm-mmu-fix-infinite-loop-when-there-is-no-available-mmu-page.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From ed52870f4676489124d8697fd00e6ae6c504e586 Mon Sep 17 00:00:00 2001 From: Wanpeng Li <wanpeng.li(a)hotmail.com> Date: Mon, 4 Dec 2017 22:21:30 -0800 Subject: KVM: MMU: Fix infinite loop when there is no available mmu page MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit From: Wanpeng Li <wanpeng.li(a)hotmail.com> commit ed52870f4676489124d8697fd00e6ae6c504e586 upstream. The below test case can cause infinite loop in kvm when ept=0. #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <linux/kvm.h> #include <fcntl.h> #include <sys/ioctl.h> long r[5]; int main() { r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7); ioctl(r[4], KVM_RUN, 0); } It doesn't setup the memory regions, mmu_alloc_shadow/direct_roots() in kvm return 1 when kvm fails to allocate root page table which can result in beblow infinite loop: vcpu_run() { for (;;) { r = vcpu_enter_guest()::kvm_mmu_reload() returns 1 if (r <= 0) break; if (need_resched()) cond_resched(); } } This patch fixes it by returning -ENOSPC when there is no available kvm mmu page for root page table. Cc: Paolo Bonzini <pbonzini(a)redhat.com> Cc: Radim Krčmář <rkrcmar(a)redhat.com> Fixes: 26eeb53cf0f (KVM: MMU: Bail out immediately if there is no available mmu page) Signed-off-by: Wanpeng Li <wanpeng.li(a)hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/kvm/mmu.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3382,7 +3382,7 @@ static int mmu_alloc_direct_roots(struct spin_lock(&vcpu->kvm->mmu_lock); if(make_mmu_pages_available(vcpu) < 0) { spin_unlock(&vcpu->kvm->mmu_lock); - return 1; + return -ENOSPC; } sp = kvm_mmu_get_page(vcpu, 0, 0, vcpu->arch.mmu.shadow_root_level, 1, ACC_ALL); @@ -3397,7 +3397,7 @@ static int mmu_alloc_direct_roots(struct spin_lock(&vcpu->kvm->mmu_lock); if (make_mmu_pages_available(vcpu) < 0) { spin_unlock(&vcpu->kvm->mmu_lock); - return 1; + return -ENOSPC; } sp = kvm_mmu_get_page(vcpu, i << (30 - PAGE_SHIFT), i << 30, PT32_ROOT_LEVEL, 1, ACC_ALL); @@ -3437,7 +3437,7 @@ static int mmu_alloc_shadow_roots(struct spin_lock(&vcpu->kvm->mmu_lock); if (make_mmu_pages_available(vcpu) < 0) { spin_unlock(&vcpu->kvm->mmu_lock); - return 1; + return -ENOSPC; } sp = kvm_mmu_get_page(vcpu, root_gfn, 0, vcpu->arch.mmu.shadow_root_level, 0, ACC_ALL); @@ -3474,7 +3474,7 @@ static int mmu_alloc_shadow_roots(struct spin_lock(&vcpu->kvm->mmu_lock); if (make_mmu_pages_available(vcpu) < 0) { spin_unlock(&vcpu->kvm->mmu_lock); - return 1; + return -ENOSPC; } sp = kvm_mmu_get_page(vcpu, root_gfn, i << 30, PT32_ROOT_LEVEL, 0, ACC_ALL); Patches currently in stable-queue which might be from wanpeng.li(a)hotmail.com are queue-4.14/kvm-mmu-fix-infinite-loop-when-there-is-no-available-mmu-page.patch queue-4.14/kvm-x86-fix-load-rflags-w-o-the-fixed-bit.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "KVM: PPC: Book3S: fix XIVE migration of pending interrupts" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled KVM: PPC: Book3S: fix XIVE migration of pending interrupts to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: kvm-ppc-book3s-fix-xive-migration-of-pending-interrupts.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From dc1c4165d189350cb51bdd3057deb6ecd164beda Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg(a)kaod.org> Date: Tue, 12 Dec 2017 12:02:04 +0000 Subject: KVM: PPC: Book3S: fix XIVE migration of pending interrupts MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit From: Cédric Le Goater <clg(a)kaod.org> commit dc1c4165d189350cb51bdd3057deb6ecd164beda upstream. When restoring a pending interrupt, we are setting the Q bit to force a retrigger in xive_finish_unmask(). But we also need to force an EOI in this case to reach the same initial state : P=1, Q=0. This can be done by not setting 'old_p' for pending interrupts which will inform xive_finish_unmask() that an EOI needs to be sent. Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller") Suggested-by: Benjamin Herrenschmidt <benh(a)kernel.crashing.org> Signed-off-by: Cédric Le Goater <clg(a)kaod.org> Reviewed-by: Laurent Vivier <lvivier(a)redhat.com> Tested-by: Laurent Vivier <lvivier(a)redhat.com> Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/powerpc/kvm/book3s_xive.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/arch/powerpc/kvm/book3s_xive.c +++ b/arch/powerpc/kvm/book3s_xive.c @@ -1558,7 +1558,7 @@ static int xive_set_source(struct kvmppc /* * Restore P and Q. If the interrupt was pending, we - * force both P and Q, which will trigger a resend. + * force Q and !P, which will trigger a resend. * * That means that a guest that had both an interrupt * pending (queued) and Q set will restore with only @@ -1566,7 +1566,7 @@ static int xive_set_source(struct kvmppc * is perfectly fine as coalescing interrupts that haven't * been presented yet is always allowed. */ - if (val & KVM_XICS_PRESENTED || val & KVM_XICS_PENDING) + if (val & KVM_XICS_PRESENTED && !(val & KVM_XICS_PENDING)) state->old_p = true; if (val & KVM_XICS_QUEUED || val & KVM_XICS_PENDING) state->old_q = true; Patches currently in stable-queue which might be from clg(a)kaod.org are queue-4.14/kvm-ppc-book3s-fix-xive-migration-of-pending-interrupts.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "KVM: arm/arm64: Fix HYP unmapping going off limits" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled KVM: arm/arm64: Fix HYP unmapping going off limits to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: kvm-arm-arm64-fix-hyp-unmapping-going-off-limits.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 7839c672e58bf62da8f2f0197fefb442c02ba1dd Mon Sep 17 00:00:00 2001 From: Marc Zyngier <marc.zyngier(a)arm.com> Date: Thu, 7 Dec 2017 11:45:45 +0000 Subject: KVM: arm/arm64: Fix HYP unmapping going off limits From: Marc Zyngier <marc.zyngier(a)arm.com> commit 7839c672e58bf62da8f2f0197fefb442c02ba1dd upstream. When we unmap the HYP memory, we try to be clever and unmap one PGD at a time. If we start with a non-PGD aligned address and try to unmap a whole PGD, things go horribly wrong in unmap_hyp_range (addr and end can never match, and it all goes really badly as we keep incrementing pgd and parse random memory as page tables...). The obvious fix is to let unmap_hyp_range do what it does best, which is to iterate over a range. The size of the linear mapping, which begins at PAGE_OFFSET, can be easily calculated by subtracting PAGE_OFFSET form high_memory, because high_memory is defined as the linear map address of the last byte of DRAM, plus one. The size of the vmalloc region is given trivially by VMALLOC_END - VMALLOC_START. Reported-by: Andre Przywara <andre.przywara(a)arm.com> Tested-by: Andre Przywara <andre.przywara(a)arm.com> Reviewed-by: Christoffer Dall <christoffer.dall(a)linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier(a)arm.com> Signed-off-by: Christoffer Dall <christoffer.dall(a)linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- virt/kvm/arm/mmu.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) --- a/virt/kvm/arm/mmu.c +++ b/virt/kvm/arm/mmu.c @@ -509,8 +509,6 @@ static void unmap_hyp_range(pgd_t *pgdp, */ void free_hyp_pgds(void) { - unsigned long addr; - mutex_lock(&kvm_hyp_pgd_mutex); if (boot_hyp_pgd) { @@ -521,10 +519,10 @@ void free_hyp_pgds(void) if (hyp_pgd) { unmap_hyp_range(hyp_pgd, hyp_idmap_start, PAGE_SIZE); - for (addr = PAGE_OFFSET; virt_addr_valid(addr); addr += PGDIR_SIZE) - unmap_hyp_range(hyp_pgd, kern_hyp_va(addr), PGDIR_SIZE); - for (addr = VMALLOC_START; is_vmalloc_addr((void*)addr); addr += PGDIR_SIZE) - unmap_hyp_range(hyp_pgd, kern_hyp_va(addr), PGDIR_SIZE); + unmap_hyp_range(hyp_pgd, kern_hyp_va(PAGE_OFFSET), + (uintptr_t)high_memory - PAGE_OFFSET); + unmap_hyp_range(hyp_pgd, kern_hyp_va(VMALLOC_START), + VMALLOC_END - VMALLOC_START); free_pages((unsigned long)hyp_pgd, hyp_pgd_order); hyp_pgd = NULL; Patches currently in stable-queue which might be from marc.zyngier(a)arm.com are queue-4.14/arm64-kvm-prevent-restoring-stale-pmscr_el1-for-vcpu.patch queue-4.14/kvm-arm-arm64-fix-hyp-unmapping-going-off-limits.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "init: Invoke init_espfix_bsp() from mm_init()" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled init: Invoke init_espfix_bsp() from mm_init() to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: init-invoke-init_espfix_bsp-from-mm_init.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 613e396bc0d4c7604fba23256644e78454c68cf6 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner <tglx(a)linutronix.de> Date: Sun, 17 Dec 2017 10:56:29 +0100 Subject: init: Invoke init_espfix_bsp() from mm_init() From: Thomas Gleixner <tglx(a)linutronix.de> commit 613e396bc0d4c7604fba23256644e78454c68cf6 upstream. init_espfix_bsp() needs to be invoked before the page table isolation initialization. Move it into mm_init() which is the place where pti_init() will be added. While at it get rid of the #ifdeffery and provide proper stub functions. Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: H. Peter Anvin <hpa(a)zytor.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Juergen Gross <jgross(a)suse.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/include/asm/espfix.h | 7 ++++--- arch/x86/kernel/smpboot.c | 6 +----- include/asm-generic/pgtable.h | 5 +++++ init/main.c | 6 ++---- 4 files changed, 12 insertions(+), 12 deletions(-) --- a/arch/x86/include/asm/espfix.h +++ b/arch/x86/include/asm/espfix.h @@ -2,7 +2,7 @@ #ifndef _ASM_X86_ESPFIX_H #define _ASM_X86_ESPFIX_H -#ifdef CONFIG_X86_64 +#ifdef CONFIG_X86_ESPFIX64 #include <asm/percpu.h> @@ -11,7 +11,8 @@ DECLARE_PER_CPU_READ_MOSTLY(unsigned lon extern void init_espfix_bsp(void); extern void init_espfix_ap(int cpu); - -#endif /* CONFIG_X86_64 */ +#else +static inline void init_espfix_ap(int cpu) { } +#endif #endif /* _ASM_X86_ESPFIX_H */ --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -990,12 +990,8 @@ static int do_boot_cpu(int apicid, int c initial_code = (unsigned long)start_secondary; initial_stack = idle->thread.sp; - /* - * Enable the espfix hack for this CPU - */ -#ifdef CONFIG_X86_ESPFIX64 + /* Enable the espfix hack for this CPU */ init_espfix_ap(cpu); -#endif /* So we see what's up */ announce_cpu(cpu, apicid); --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -1025,6 +1025,11 @@ static inline int pmd_clear_huge(pmd_t * struct file; int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn, unsigned long size, pgprot_t *vma_prot); + +#ifndef CONFIG_X86_ESPFIX64 +static inline void init_espfix_bsp(void) { } +#endif + #endif /* !__ASSEMBLY__ */ #ifndef io_remap_pfn_range --- a/init/main.c +++ b/init/main.c @@ -504,6 +504,8 @@ static void __init mm_init(void) pgtable_init(); vmalloc_init(); ioremap_huge_init(); + /* Should be run before the first non-init thread is created */ + init_espfix_bsp(); } asmlinkage __visible void __init start_kernel(void) @@ -674,10 +676,6 @@ asmlinkage __visible void __init start_k if (efi_enabled(EFI_RUNTIME_SERVICES)) efi_enter_virtual_mode(); #endif -#ifdef CONFIG_X86_ESPFIX64 - /* Should be run before the first non-init thread is created */ - init_espfix_bsp(); -#endif thread_stack_cache_init(); cred_init(); fork_init(); Patches currently in stable-queue which might be from tglx(a)linutronix.de are queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch queue-4.14/x86-decoder-fix-and-update-the-opcodes-map.patch queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch queue-4.14/x86-mm-create-asm-invpcid.h.patch queue-4.14/x86-cpu_entry_area-prevent-wraparound-in-setup_cpu_entry_area_ptes-on-32bit.patch queue-4.14/x86-mm-remove-superfluous-barriers.patch queue-4.14/x86-ldt-rework-locking.patch queue-4.14/pci-pm-force-devices-to-d0-in-pci_pm_thaw_noirq.patch queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch queue-4.14/tools-headers-sync-objtool-uapi-header.patch queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch queue-4.14/objtool-fix-64-bit-build-on-32-bit-host.patch queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch queue-4.14/objtool-move-synced-files-to-their-original-relative-locations.patch queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch queue-4.14/objtool-move-kernel-headers-code-sync-check-to-a-script.patch queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch queue-4.14/objtool-fix-cross-build.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "drm/sun4i: Fix error path handling" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled drm/sun4i: Fix error path handling to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: drm-sun4i-fix-error-path-handling.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 92411f6d7f1afcc95e54295d40e96a75385212ec Mon Sep 17 00:00:00 2001 From: Maxime Ripard <maxime.ripard(a)free-electrons.com> Date: Thu, 7 Dec 2017 16:58:50 +0100 Subject: drm/sun4i: Fix error path handling From: Maxime Ripard <maxime.ripard(a)free-electrons.com> commit 92411f6d7f1afcc95e54295d40e96a75385212ec upstream. The commit 4c7f16d14a33 ("drm/sun4i: Fix TCON clock and regmap initialization sequence") moved a bunch of logic around, but forgot to update the gotos after the introduction of the err_free_dotclock label. It means that if we fail later that the one introduced in that commit, we'll just to the old label which isn't free the clock we created. This will result in a breakage as soon as someone tries to do something with that clock, since its resources will have been long reclaimed. Fixes: 4c7f16d14a33 ("drm/sun4i: Fix TCON clock and regmap initialization sequence") Reviewed-by: Chen-Yu Tsai <wens(a)csie.org> Signed-off-by: Maxime Ripard <maxime.ripard(a)free-electrons.com> Link: https://patchwork.freedesktop.org/patch/msgid/f83c1cebc731f0b4251f5ddd7b38c… Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/gpu/drm/sun4i/sun4i_tcon.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/drivers/gpu/drm/sun4i/sun4i_tcon.c +++ b/drivers/gpu/drm/sun4i/sun4i_tcon.c @@ -567,12 +567,12 @@ static int sun4i_tcon_bind(struct device if (IS_ERR(tcon->crtc)) { dev_err(dev, "Couldn't create our CRTC\n"); ret = PTR_ERR(tcon->crtc); - goto err_free_clocks; + goto err_free_dotclock; } ret = sun4i_rgb_init(drm, tcon); if (ret < 0) - goto err_free_clocks; + goto err_free_dotclock; list_add_tail(&tcon->list, &drv->tcon_list); Patches currently in stable-queue which might be from maxime.ripard(a)free-electrons.com are queue-4.14/drm-sun4i-fix-error-path-handling.patch queue-4.14/clk-sunxi-sun9i-mmc-implement-reset-callback-for-reset-controls.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "crypto: skcipher - set walk.iv for zero-length inputs" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled crypto: skcipher - set walk.iv for zero-length inputs to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: crypto-skcipher-set-walk.iv-for-zero-length-inputs.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 2b4f27c36bcd46e820ddb9a8e6fe6a63fa4250b8 Mon Sep 17 00:00:00 2001 From: Eric Biggers <ebiggers(a)google.com> Date: Wed, 29 Nov 2017 01:18:57 -0800 Subject: crypto: skcipher - set walk.iv for zero-length inputs From: Eric Biggers <ebiggers(a)google.com> commit 2b4f27c36bcd46e820ddb9a8e6fe6a63fa4250b8 upstream. All the ChaCha20 algorithms as well as the ARM bit-sliced AES-XTS algorithms call skcipher_walk_virt(), then access the IV (walk.iv) before checking whether any bytes need to be processed (walk.nbytes). But if the input is empty, then skcipher_walk_virt() doesn't set the IV, and the algorithms crash trying to use the uninitialized IV pointer. Fix it by setting the IV earlier in skcipher_walk_virt(). Also fix it for the AEAD walk functions. This isn't a perfect solution because we can't actually align the IV to ->cra_alignmask unless there are bytes to process, for one because the temporary buffer for the aligned IV is freed by skcipher_walk_done(), which is only called when there are bytes to process. Thus, algorithms that require aligned IVs will still need to avoid accessing the IV when walk.nbytes == 0. Still, many algorithms/architectures are fine with IVs having any alignment, and even for those that aren't, a misaligned pointer bug is much less severe than an uninitialized pointer bug. This change also matches the behavior of the older blkcipher_walk API. Fixes: 0cabf2af6f5a ("crypto: skcipher - Fix crash on zero-length input") Reported-by: syzbot <syzkaller(a)googlegroups.com> Signed-off-by: Eric Biggers <ebiggers(a)google.com> Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- crypto/skcipher.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) --- a/crypto/skcipher.c +++ b/crypto/skcipher.c @@ -449,6 +449,8 @@ static int skcipher_walk_skcipher(struct walk->total = req->cryptlen; walk->nbytes = 0; + walk->iv = req->iv; + walk->oiv = req->iv; if (unlikely(!walk->total)) return 0; @@ -456,9 +458,6 @@ static int skcipher_walk_skcipher(struct scatterwalk_start(&walk->in, req->src); scatterwalk_start(&walk->out, req->dst); - walk->iv = req->iv; - walk->oiv = req->iv; - walk->flags &= ~SKCIPHER_WALK_SLEEP; walk->flags |= req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP ? SKCIPHER_WALK_SLEEP : 0; @@ -510,6 +509,8 @@ static int skcipher_walk_aead_common(str int err; walk->nbytes = 0; + walk->iv = req->iv; + walk->oiv = req->iv; if (unlikely(!walk->total)) return 0; @@ -525,9 +526,6 @@ static int skcipher_walk_aead_common(str scatterwalk_done(&walk->in, 0, walk->total); scatterwalk_done(&walk->out, 0, walk->total); - walk->iv = req->iv; - walk->oiv = req->iv; - if (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) walk->flags |= SKCIPHER_WALK_SLEEP; else Patches currently in stable-queue which might be from ebiggers(a)google.com are queue-4.14/crypto-skcipher-set-walk.iv-for-zero-length-inputs.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "drm/i915: Flush pending GTT writes before unbinding" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled drm/i915: Flush pending GTT writes before unbinding to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: drm-i915-flush-pending-gtt-writes-before-unbinding.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 2797c4a11f373b2545c2398ccb02e362ee66a142 Mon Sep 17 00:00:00 2001 From: Chris Wilson <chris(a)chris-wilson.co.uk> Date: Mon, 4 Dec 2017 13:25:13 +0000 Subject: drm/i915: Flush pending GTT writes before unbinding From: Chris Wilson <chris(a)chris-wilson.co.uk> commit 2797c4a11f373b2545c2398ccb02e362ee66a142 upstream. >From the shrinker paths, we want to relinquish the GPU and GGTT access to the object, releasing the backing storage back to the system for swapout. As a part of that process we would unpin the pages, marking them for access by the CPU (for the swapout/swapin). However, if that process was interrupted after unbind the vma, we missed a flush of the inflight GGTT writes before we made that GTT space available again for reuse, with the prospect that we would redirect them to another page. The bug dates back to the introduction of multiple GGTT vma, but the code itself dates to commit 02bef8f98d26 ("drm/i915: Unbind closed vma for i915_gem_object_unbind()"). Fixes: 02bef8f98d26 ("drm/i915: Unbind closed vma for i915_gem_object_unbind()") Fixes: c5ad54cf7dd8 ("drm/i915: Use partial view in mmap fault handler") Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171204132513.7303-1-chris@c… (cherry picked from commit 5888fc9eac3c2ff96e76aeeb865fdb46ab2d711e) Signed-off-by: Jani Nikula <jani.nikula(a)intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/gpu/drm/i915/i915_gem.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-) --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -325,17 +325,10 @@ int i915_gem_object_unbind(struct drm_i9 * must wait for all rendering to complete to the object (as unbinding * must anyway), and retire the requests. */ - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_LOCKED | - I915_WAIT_ALL, - MAX_SCHEDULE_TIMEOUT, - NULL); + ret = i915_gem_object_set_to_cpu_domain(obj, false); if (ret) return ret; - i915_gem_retire_requests(to_i915(obj->base.dev)); - while ((vma = list_first_entry_or_null(&obj->vma_list, struct i915_vma, obj_link))) { Patches currently in stable-queue which might be from chris(a)chris-wilson.co.uk are queue-4.14/drm-i915-flush-pending-gtt-writes-before-unbinding.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "crypto: mcryptd - protect the per-CPU queue with a lock" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled crypto: mcryptd - protect the per-CPU queue with a lock to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: crypto-mcryptd-protect-the-per-cpu-queue-with-a-lock.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 9abffc6f2efe46c3564c04312e52e07622d40e51 Mon Sep 17 00:00:00 2001 From: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de> Date: Thu, 30 Nov 2017 13:39:27 +0100 Subject: crypto: mcryptd - protect the per-CPU queue with a lock From: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de> commit 9abffc6f2efe46c3564c04312e52e07622d40e51 upstream. mcryptd_enqueue_request() grabs the per-CPU queue struct and protects access to it with disabled preemption. Then it schedules a worker on the same CPU. The worker in mcryptd_queue_worker() guards access to the same per-CPU variable with disabled preemption. If we take CPU-hotplug into account then it is possible that between queue_work_on() and the actual invocation of the worker the CPU goes down and the worker will be scheduled on _another_ CPU. And here the preempt_disable() protection does not work anymore. The easiest thing is to add a spin_lock() to guard access to the list. Another detail: mcryptd_queue_worker() is not processing more than MCRYPTD_BATCH invocation in a row. If there are still items left, then it will invoke queue_work() to proceed with more later. *I* would suggest to simply drop that check because it does not use a system workqueue and the workqueue is already marked as "CPU_INTENSIVE". And if preemption is required then the scheduler should do it. However if queue_work() is used then the work item is marked as CPU unbound. That means it will try to run on the local CPU but it may run on another CPU as well. Especially with CONFIG_DEBUG_WQ_FORCE_RR_CPU=y. Again, the preempt_disable() won't work here but lock which was introduced will help. In order to keep work-item on the local CPU (and avoid RR) I changed it to queue_work_on(). Signed-off-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de> Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- crypto/mcryptd.c | 23 ++++++++++------------- include/crypto/mcryptd.h | 1 + 2 files changed, 11 insertions(+), 13 deletions(-) --- a/crypto/mcryptd.c +++ b/crypto/mcryptd.c @@ -81,6 +81,7 @@ static int mcryptd_init_queue(struct mcr pr_debug("cpu_queue #%d %p\n", cpu, queue->cpu_queue); crypto_init_queue(&cpu_queue->queue, max_cpu_qlen); INIT_WORK(&cpu_queue->work, mcryptd_queue_worker); + spin_lock_init(&cpu_queue->q_lock); } return 0; } @@ -104,15 +105,16 @@ static int mcryptd_enqueue_request(struc int cpu, err; struct mcryptd_cpu_queue *cpu_queue; - cpu = get_cpu(); - cpu_queue = this_cpu_ptr(queue->cpu_queue); - rctx->tag.cpu = cpu; + cpu_queue = raw_cpu_ptr(queue->cpu_queue); + spin_lock(&cpu_queue->q_lock); + cpu = smp_processor_id(); + rctx->tag.cpu = smp_processor_id(); err = crypto_enqueue_request(&cpu_queue->queue, request); pr_debug("enqueue request: cpu %d cpu_queue %p request %p\n", cpu, cpu_queue, request); + spin_unlock(&cpu_queue->q_lock); queue_work_on(cpu, kcrypto_wq, &cpu_queue->work); - put_cpu(); return err; } @@ -161,16 +163,11 @@ static void mcryptd_queue_worker(struct cpu_queue = container_of(work, struct mcryptd_cpu_queue, work); i = 0; while (i < MCRYPTD_BATCH || single_task_running()) { - /* - * preempt_disable/enable is used to prevent - * being preempted by mcryptd_enqueue_request() - */ - local_bh_disable(); - preempt_disable(); + + spin_lock_bh(&cpu_queue->q_lock); backlog = crypto_get_backlog(&cpu_queue->queue); req = crypto_dequeue_request(&cpu_queue->queue); - preempt_enable(); - local_bh_enable(); + spin_unlock_bh(&cpu_queue->q_lock); if (!req) { mcryptd_opportunistic_flush(); @@ -185,7 +182,7 @@ static void mcryptd_queue_worker(struct ++i; } if (cpu_queue->queue.qlen) - queue_work(kcrypto_wq, &cpu_queue->work); + queue_work_on(smp_processor_id(), kcrypto_wq, &cpu_queue->work); } void mcryptd_flusher(struct work_struct *__work) --- a/include/crypto/mcryptd.h +++ b/include/crypto/mcryptd.h @@ -27,6 +27,7 @@ static inline struct mcryptd_ahash *__mc struct mcryptd_cpu_queue { struct crypto_queue queue; + spinlock_t q_lock; struct work_struct work; }; Patches currently in stable-queue which might be from bigeasy(a)linutronix.de are queue-4.14/crypto-mcryptd-protect-the-per-cpu-queue-with-a-lock.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "crypto: af_alg - wait for data at beginning of recvmsg" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled crypto: af_alg - wait for data at beginning of recvmsg to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: crypto-af_alg-wait-for-data-at-beginning-of-recvmsg.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 11edb555966ed2c66c533d17c604f9d7e580a829 Mon Sep 17 00:00:00 2001 From: Stephan Mueller <smueller(a)chronox.de> Date: Wed, 29 Nov 2017 12:02:23 +0100 Subject: crypto: af_alg - wait for data at beginning of recvmsg From: Stephan Mueller <smueller(a)chronox.de> commit 11edb555966ed2c66c533d17c604f9d7e580a829 upstream. The wait for data is a non-atomic operation that can sleep and therefore potentially release the socket lock. The release of the socket lock allows another thread to modify the context data structure. The waiting operation for new data therefore must be called at the beginning of recvmsg. This prevents a race condition where checks of the members of the context data structure are performed by recvmsg while there is a potential for modification of these values. Fixes: e870456d8e7c ("crypto: algif_skcipher - overhaul memory management") Fixes: d887c52d6ae4 ("crypto: algif_aead - overhaul memory management") Reported-by: syzbot <syzkaller(a)googlegroups.com> Signed-off-by: Stephan Mueller <smueller(a)chronox.de> Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- crypto/af_alg.c | 6 ------ crypto/algif_aead.c | 6 ++++++ crypto/algif_skcipher.c | 6 ++++++ 3 files changed, 12 insertions(+), 6 deletions(-) --- a/crypto/af_alg.c +++ b/crypto/af_alg.c @@ -1165,12 +1165,6 @@ int af_alg_get_rsgl(struct sock *sk, str if (!af_alg_readable(sk)) break; - if (!ctx->used) { - err = af_alg_wait_for_data(sk, flags); - if (err) - return err; - } - seglen = min_t(size_t, (maxsize - len), msg_data_left(msg)); --- a/crypto/algif_aead.c +++ b/crypto/algif_aead.c @@ -111,6 +111,12 @@ static int _aead_recvmsg(struct socket * size_t usedpages = 0; /* [in] RX bufs to be used from user */ size_t processed = 0; /* [in] TX bufs to be consumed */ + if (!ctx->used) { + err = af_alg_wait_for_data(sk, flags); + if (err) + return err; + } + /* * Data length provided by caller via sendmsg/sendpage that has not * yet been processed. --- a/crypto/algif_skcipher.c +++ b/crypto/algif_skcipher.c @@ -72,6 +72,12 @@ static int _skcipher_recvmsg(struct sock int err = 0; size_t len = 0; + if (!ctx->used) { + err = af_alg_wait_for_data(sk, flags); + if (err) + return err; + } + /* Allocate cipher request for current operation. */ areq = af_alg_alloc_areq(sk, sizeof(struct af_alg_async_req) + crypto_skcipher_reqsize(tfm)); Patches currently in stable-queue which might be from smueller(a)chronox.de are queue-4.14/crypto-af_alg-wait-for-data-at-beginning-of-recvmsg.patch queue-4.14/crypto-af_alg-fix-race-accessing-cipher-request.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "crypto: af_alg - fix race accessing cipher request" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled crypto: af_alg - fix race accessing cipher request to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: crypto-af_alg-fix-race-accessing-cipher-request.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From d53c5135792319e095bb126bc43b2ee98586f7fe Mon Sep 17 00:00:00 2001 From: Stephan Mueller <smueller(a)chronox.de> Date: Fri, 8 Dec 2017 11:50:37 +0100 Subject: crypto: af_alg - fix race accessing cipher request From: Stephan Mueller <smueller(a)chronox.de> commit d53c5135792319e095bb126bc43b2ee98586f7fe upstream. When invoking an asynchronous cipher operation, the invocation of the callback may be performed before the subsequent operations in the initial code path are invoked. The callback deletes the cipher request data structure which implies that after the invocation of the asynchronous cipher operation, this data structure must not be accessed any more. The setting of the return code size with the request data structure must therefore be moved before the invocation of the asynchronous cipher operation. Fixes: e870456d8e7c ("crypto: algif_skcipher - overhaul memory management") Fixes: d887c52d6ae4 ("crypto: algif_aead - overhaul memory management") Reported-by: syzbot <syzkaller(a)googlegroups.com> Signed-off-by: Stephan Mueller <smueller(a)chronox.de> Acked-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com> Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- crypto/algif_aead.c | 10 +++++----- crypto/algif_skcipher.c | 10 +++++----- 2 files changed, 10 insertions(+), 10 deletions(-) --- a/crypto/algif_aead.c +++ b/crypto/algif_aead.c @@ -291,6 +291,10 @@ static int _aead_recvmsg(struct socket * /* AIO operation */ sock_hold(sk); areq->iocb = msg->msg_iocb; + + /* Remember output size that will be generated. */ + areq->outlen = outlen; + aead_request_set_callback(&areq->cra_u.aead_req, CRYPTO_TFM_REQ_MAY_BACKLOG, af_alg_async_cb, areq); @@ -298,12 +302,8 @@ static int _aead_recvmsg(struct socket * crypto_aead_decrypt(&areq->cra_u.aead_req); /* AIO operation in progress */ - if (err == -EINPROGRESS || err == -EBUSY) { - /* Remember output size that will be generated. */ - areq->outlen = outlen; - + if (err == -EINPROGRESS || err == -EBUSY) return -EIOCBQUEUED; - } sock_put(sk); } else { --- a/crypto/algif_skcipher.c +++ b/crypto/algif_skcipher.c @@ -125,6 +125,10 @@ static int _skcipher_recvmsg(struct sock /* AIO operation */ sock_hold(sk); areq->iocb = msg->msg_iocb; + + /* Remember output size that will be generated. */ + areq->outlen = len; + skcipher_request_set_callback(&areq->cra_u.skcipher_req, CRYPTO_TFM_REQ_MAY_SLEEP, af_alg_async_cb, areq); @@ -133,12 +137,8 @@ static int _skcipher_recvmsg(struct sock crypto_skcipher_decrypt(&areq->cra_u.skcipher_req); /* AIO operation in progress */ - if (err == -EINPROGRESS || err == -EBUSY) { - /* Remember output size that will be generated. */ - areq->outlen = len; - + if (err == -EINPROGRESS || err == -EBUSY) return -EIOCBQUEUED; - } sock_put(sk); } else { Patches currently in stable-queue which might be from smueller(a)chronox.de are queue-4.14/crypto-af_alg-wait-for-data-at-beginning-of-recvmsg.patch queue-4.14/crypto-af_alg-fix-race-accessing-cipher-request.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "block: unalign call_single_data in struct request" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled block: unalign call_single_data in struct request to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: block-unalign-call_single_data-in-struct-request.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 4ccafe032005e9b96acbef2e389a4de5b1254add Mon Sep 17 00:00:00 2001 From: Jens Axboe <axboe(a)kernel.dk> Date: Wed, 20 Dec 2017 13:13:58 -0700 Subject: block: unalign call_single_data in struct request From: Jens Axboe <axboe(a)kernel.dk> commit 4ccafe032005e9b96acbef2e389a4de5b1254add upstream. A previous change blindly added massive alignment to the call_single_data structure in struct request. This ballooned it in size from 296 to 320 bytes on my setup, for no valid reason at all. Use the unaligned struct __call_single_data variant instead. Fixes: 966a967116e69 ("smp: Avoid using two cache lines for struct call_single_data") Signed-off-by: Jens Axboe <axboe(a)kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- include/linux/blkdev.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -135,7 +135,7 @@ typedef __u32 __bitwise req_flags_t; struct request { struct list_head queuelist; union { - call_single_data_t csd; + struct __call_single_data csd; u64 fifo_time; }; Patches currently in stable-queue which might be from axboe(a)kernel.dk are queue-4.14/block-unalign-call_single_data-in-struct-request.patch queue-4.14/block-throttle-avoid-double-charge.patch

7 years, 8 months

1
0
0 0

[Linux-stable-mirror] Patch "clk: sunxi: sun9i-mmc: Implement reset callback for reset controls" has been added to the 4.14-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled clk: sunxi: sun9i-mmc: Implement reset callback for reset controls to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: clk-sunxi-sun9i-mmc-implement-reset-callback-for-reset-controls.patch and it can be found in the queue-4.14 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From 61d2f2a05765a5f57149efbd93e3e81a83cbc2c1 Mon Sep 17 00:00:00 2001 From: Chen-Yu Tsai <wens(a)csie.org> Date: Mon, 18 Dec 2017 11:57:51 +0800 Subject: clk: sunxi: sun9i-mmc: Implement reset callback for reset controls From: Chen-Yu Tsai <wens(a)csie.org> commit 61d2f2a05765a5f57149efbd93e3e81a83cbc2c1 upstream. Our MMC host driver now issues a reset, instead of just deasserting the reset control, since commit c34eda69ad4c ("mmc: sunxi: Reset the device at probe time"). The sun9i-mmc clock driver does not support this, and will fail, which results in MMC not probing. This patch implements the reset callback by asserting the reset control, then deasserting it after a small delay. Fixes: 7a6fca879f59 ("clk: sunxi: Add driver for A80 MMC config clocks/resets") Signed-off-by: Chen-Yu Tsai <wens(a)csie.org> Acked-by: Philipp Zabel <p.zabel(a)pengutronix.de> Acked-by: Maxime Ripard <maxime.ripard(a)free-electrons.com> Signed-off-by: Michael Turquette <mturquette(a)baylibre.com> Link: lkml.kernel.org/r/20171218035751.20661-1-wens@csie.org Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/clk/sunxi/clk-sun9i-mmc.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) --- a/drivers/clk/sunxi/clk-sun9i-mmc.c +++ b/drivers/clk/sunxi/clk-sun9i-mmc.c @@ -16,6 +16,7 @@ #include <linux/clk.h> #include <linux/clk-provider.h> +#include <linux/delay.h> #include <linux/init.h> #include <linux/of.h> #include <linux/of_device.h> @@ -83,9 +84,20 @@ static int sun9i_mmc_reset_deassert(stru return 0; } +static int sun9i_mmc_reset_reset(struct reset_controller_dev *rcdev, + unsigned long id) +{ + sun9i_mmc_reset_assert(rcdev, id); + udelay(10); + sun9i_mmc_reset_deassert(rcdev, id); + + return 0; +} + static const struct reset_control_ops sun9i_mmc_reset_ops = { .assert = sun9i_mmc_reset_assert, .deassert = sun9i_mmc_reset_deassert, + .reset = sun9i_mmc_reset_reset, }; static int sun9i_a80_mmc_config_clk_probe(struct platform_device *pdev) Patches currently in stable-queue which might be from wens(a)csie.org are queue-4.14/drm-sun4i-fix-error-path-handling.patch queue-4.14/clk-sunxi-sun9i-mmc-implement-reset-callback-for-reset-controls.patch

7 years, 8 months

1
0
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror