Hi Greg,
On 05/27/2018 04:32 PM, gregkh(a)linuxfoundation.org wrote:
> This is a note to let you know that I've just added the patch titled
>
> drm/vmwgfx: Unpin the screen object backup buffer when not used
>
> to the 4.16-stable tree which can be found at:
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.kernel.org_git_-3Fp…
I got an identical message on May 2nd that this patch was added to the
4.16 and 4.14 stable trees. Now there is a small but important fix to
this patch so I CC'd it to stable assuming that this patch was already
in those versions.
Fix is "drm/vmwgfx: Set dmabuf_size_when vmw_dmabuf_init is successful"
by Deepak Rawat.
It's correct to have both this patch and its fix in stable.
Thanks,
Thomas
Hi Greg,
9 more patches against the 2018/05/23 linux-4.4.y stable branch.
This gets the spectre defense of 4.4 up-to-date compared to the
current upstream tree. The upstream patches to remove the indirect
branches from the BPF JIT are included (these do not have a
CC:stable tag).
Martin Schwidefsky (9):
s390: add assembler macros for CPU alternatives
s390: move expoline assembler macros to a header
s390/lib: use expoline for indirect branches
s390/ftrace: use expoline for indirect branches
s390/kernel: use expoline for indirect branches
s390: move spectre sysfs attribute code
s390: remove indirect branch from do_softirq_own_stack
s390: extend expoline to BC instructions
s390: use expoline thunks in the BPF JIT
arch/s390/include/asm/alternative-asm.h | 108 ++++++++++++++++++
arch/s390/include/asm/nospec-insn.h | 193 ++++++++++++++++++++++++++++++++
arch/s390/kernel/Makefile | 1 +
arch/s390/kernel/asm-offsets.c | 1 +
arch/s390/kernel/base.S | 24 ++--
arch/s390/kernel/entry.S | 105 ++++-------------
arch/s390/kernel/irq.c | 5 +-
arch/s390/kernel/mcount.S | 14 ++-
arch/s390/kernel/nospec-branch.c | 43 ++++---
arch/s390/kernel/nospec-sysfs.c | 21 ++++
arch/s390/kernel/reipl.S | 5 +-
arch/s390/kernel/swsusp.S | 10 +-
arch/s390/lib/mem.S | 9 +-
arch/s390/net/bpf_jit.S | 16 ++-
arch/s390/net/bpf_jit_comp.c | 63 ++++++++++-
15 files changed, 480 insertions(+), 138 deletions(-)
create mode 100644 arch/s390/include/asm/alternative-asm.h
create mode 100644 arch/s390/include/asm/nospec-insn.h
create mode 100644 arch/s390/kernel/nospec-sysfs.c
--
2.16.3
Hi Greg,
Please queue up this series of patches for 4.14 if you have no objections.
cheers
v2: Fixed up upstream commit markings.
Mauricio Faria de Oliveira (4):
powerpc/rfi-flush: Differentiate enabled and patched flush types
powerpc/pseries: Fix clearing of security feature flags
powerpc: Move default security feature flags
powerpc/pseries: Restore default security feature flags on setup
Michael Ellerman (17):
powerpc/pseries: Support firmware disable of RFI flush
powerpc/powernv: Support firmware disable of RFI flush
powerpc/rfi-flush: Move the logic to avoid a redo into the debugfs
code
powerpc/rfi-flush: Make it possible to call setup_rfi_flush() again
powerpc/rfi-flush: Always enable fallback flush on pseries
powerpc/rfi-flush: Call setup_rfi_flush() after LPM migration
powerpc/pseries: Add new H_GET_CPU_CHARACTERISTICS flags
powerpc: Add security feature flags for Spectre/Meltdown
powerpc/pseries: Set or clear security feature flags
powerpc/powernv: Set or clear security feature flags
powerpc/64s: Move cpu_show_meltdown()
powerpc/64s: Enhance the information in cpu_show_meltdown()
powerpc/powernv: Use the security flags in pnv_setup_rfi_flush()
powerpc/pseries: Use the security flags in pseries_setup_rfi_flush()
powerpc/64s: Wire up cpu_show_spectre_v1()
powerpc/64s: Wire up cpu_show_spectre_v2()
powerpc/64s: Fix section mismatch warnings from setup_rfi_flush()
Nicholas Piggin (2):
powerpc/64s: Improve RFI L1-D cache flush fallback
powerpc/64s: Add support for a store forwarding barrier at kernel
entry/exit
arch/powerpc/include/asm/exception-64s.h | 29 ++++
arch/powerpc/include/asm/feature-fixups.h | 19 +++
arch/powerpc/include/asm/hvcall.h | 3 +
arch/powerpc/include/asm/paca.h | 3 +-
arch/powerpc/include/asm/security_features.h | 85 ++++++++++
arch/powerpc/include/asm/setup.h | 2 +-
arch/powerpc/kernel/Makefile | 2 +-
arch/powerpc/kernel/asm-offsets.c | 3 +-
arch/powerpc/kernel/exceptions-64s.S | 95 ++++++-----
arch/powerpc/kernel/security.c | 237 +++++++++++++++++++++++++++
arch/powerpc/kernel/setup_64.c | 48 ++----
arch/powerpc/kernel/vmlinux.lds.S | 14 ++
arch/powerpc/lib/feature-fixups.c | 124 +++++++++++++-
arch/powerpc/platforms/powernv/setup.c | 92 ++++++++---
arch/powerpc/platforms/pseries/mobility.c | 3 +
arch/powerpc/platforms/pseries/pseries.h | 2 +
arch/powerpc/platforms/pseries/setup.c | 81 +++++++--
arch/powerpc/xmon/xmon.c | 2 +
18 files changed, 721 insertions(+), 123 deletions(-)
create mode 100644 arch/powerpc/include/asm/security_features.h
create mode 100644 arch/powerpc/kernel/security.c
--
2.14.1
From: Changwei Ge <ge.changwei(a)h3c.com>
Somehow, file system metadata was corrupted, which causes
ocfs2_check_dir_entry() to fail in function ocfs2_dir_foreach_blk_el().
According to the original design intention, if above happens we should
skip the problematic block and continue to retrieve dir entry. But there
is obviouse misuse of brelse around related code.
After failure of ocfs2_check_dir_entry(), currunt code just moves to next
position and uses the problematic buffer head again and again during
which the problematic buffer head is released for multiple times. I
suppose, this a serious issue which is long-lived in ocfs2. This may
cause other file systems which is also used in a the same host insane.
So we should also consider about bakcporting this patch into
linux -stable.
Suggested-by: Changkuo Shi <shi.changkuo(a)h3c.com>
Signed-off-by: Changwei Ge <ge.changwei(a)h3c.com>
---
fs/ocfs2/dir.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/ocfs2/dir.c b/fs/ocfs2/dir.c
index b048d4f..c121abb 100644
--- a/fs/ocfs2/dir.c
+++ b/fs/ocfs2/dir.c
@@ -1897,8 +1897,7 @@ static int ocfs2_dir_foreach_blk_el(struct inode *inode,
/* On error, skip the f_pos to the
next block. */
ctx->pos = (ctx->pos | (sb->s_blocksize - 1)) + 1;
- brelse(bh);
- continue;
+ break;
}
if (le64_to_cpu(de->inode)) {
unsigned char d_type = DT_UNKNOWN;
--
2.7.4
From: Changwei Ge <ge.changwei(a)h3c.com>
Somehow, file system metadata was corrupted, which causes
ocfs2_check_dir_entry() fail in function ocfs2_dir_foreach_blk_el().
According to the original design intention, if above happens we should
skip the problematic block and continue to retrieve dir entry. But there
is obviouse misuse of brelse around related code.
After failure of ocfs2_check_dir_entry(), currunt code just moves to next
position and uses the problematic buffer head again and again during
which the problematic buffer head is released for multiple times. I
suppose, this a serious issue which is long-lived in ocfs2. This may
cause other file systems which is also used in a the same host insane.
So we should also consider about bakcporting this patch into
linux -stable.
Suggested-by: Changkuo Shi <shi.changkuo(a)h3c.com>
Signed-off-by: Changwei Ge <ge.changwei(a)h3c.com>
---
fs/ocfs2/dir.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/ocfs2/dir.c b/fs/ocfs2/dir.c
index b048d4f..c121abb 100644
--- a/fs/ocfs2/dir.c
+++ b/fs/ocfs2/dir.c
@@ -1897,8 +1897,7 @@ static int ocfs2_dir_foreach_blk_el(struct inode *inode,
/* On error, skip the f_pos to the
next block. */
ctx->pos = (ctx->pos | (sb->s_blocksize - 1)) + 1;
- brelse(bh);
- continue;
+ break;
}
if (le64_to_cpu(de->inode)) {
unsigned char d_type = DT_UNKNOWN;
--
2.7.4
From: Changwei Ge <ge.changwei(a)h3c.com>
Somehow, file system metadata was corrupted, which cause
ocfs2_check_dir_entry() fail in function ocfs2_dir_foreach_blk_el().
According to the original design intention, if above happens we should
skip the problematic block and continue to retrieve dir entry. But there
is obviouse misuse of brelse around related code.
After failure of ocfs2_check_dir_entry(), currunt code just move to next
position and use the problematic buffer head again and again during
which the problematic buffer head is released for multiple times. I
suppose, this a serious issue which is long-lived in ocfs2. This may
cause other file systems which is also used in a the same host insane.
So we should also consider about bakcporting this patch into
linux -stable.
Suggested-by: Changkuo Shi <shi.changkuo(a)h3c.com>
Signed-off-by: Changwei Ge <ge.changwei(a)h3c.com>
---
fs/ocfs2/dir.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/ocfs2/dir.c b/fs/ocfs2/dir.c
index b048d4f..c121abb 100644
--- a/fs/ocfs2/dir.c
+++ b/fs/ocfs2/dir.c
@@ -1897,8 +1897,7 @@ static int ocfs2_dir_foreach_blk_el(struct inode *inode,
/* On error, skip the f_pos to the
next block. */
ctx->pos = (ctx->pos | (sb->s_blocksize - 1)) + 1;
- brelse(bh);
- continue;
+ break;
}
if (le64_to_cpu(de->inode)) {
unsigned char d_type = DT_UNKNOWN;
--
2.7.4
From: Changwei Ge <ge.changwei(a)h3c.com>
From: Changwei Ge <gechangwei(a)live.cn>
Somehow, file system metadata was corrupted, which cause
ocfs2_check_dir_entry() fail in function ocfs2_dir_foreach_blk_el().
According to the original design intention, if above happens we should
skip the problematic block and continue to retrieve dir entry. But there
is obviouse misuse of brelse around related code.
After failure of ocfs2_check_dir_entry(), currunt code just move to next
position and use the problematic buffer head again and again during
which the problematic buffer head is released for multiple times. I
suppose, this a serious issue which is long-lived in ocfs2. This may
cause other file systems which is also used in a the same host insane.
So we should also consider about bakcporting this patch into
linux -stable.
Suggested-by: Changkuo Shi <shi.changkuo(a)h3c.com>
Signed-off-by: Changwei Ge <ge.changwei(a)h3c.com>
---
fs/ocfs2/dir.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/ocfs2/dir.c b/fs/ocfs2/dir.c
index b048d4f..c121abb 100644
--- a/fs/ocfs2/dir.c
+++ b/fs/ocfs2/dir.c
@@ -1897,8 +1897,7 @@ static int ocfs2_dir_foreach_blk_el(struct inode *inode,
/* On error, skip the f_pos to the
next block. */
ctx->pos = (ctx->pos | (sb->s_blocksize - 1)) + 1;
- brelse(bh);
- continue;
+ break;
}
if (le64_to_cpu(de->inode)) {
unsigned char d_type = DT_UNKNOWN;
--
2.7.4
Commit 9d15cd958c17 ("media: uvcvideo: Convert from using an atomic
variable to a reference count")
didn't take into account that while the old counter was initialized to
0 (no stream open), kref_init starts with a reference of 1. The
reference count on unplug no longer reaches 0, uvc_delete isn't
called, and evdev doesn't release /dev/input/event*. Plug and unplug
enough times and it runs out of device minors preventing any new input
device and the use of newly plugged in USB video cameras until the
system is rebooted.
Signed-off-by: David Fries <David(a)Fries.net>
Cc: Guennadi Liakhovetski <g.liakhovetski(a)gmx.de>
Cc: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com>
Cc: Mauro Carvalho Chehab <mchehab(a)s-opensource.com>
Cc: stable(a)vger.kernel.org
---
drivers/media/usb/uvc/uvc_driver.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
index 2469b49..3cbdc87 100644
--- a/drivers/media/usb/uvc/uvc_driver.c
+++ b/drivers/media/usb/uvc/uvc_driver.c
@@ -1871,13 +1871,6 @@ static void uvc_unregister_video(struct uvc_device *dev)
{
struct uvc_streaming *stream;
- /* Unregistering all video devices might result in uvc_delete() being
- * called from inside the loop if there's no open file handle. To avoid
- * that, increment the refcount before iterating over the streams and
- * decrement it when done.
- */
- kref_get(&dev->ref);
-
list_for_each_entry(stream, &dev->streams, list) {
if (!video_is_registered(&stream->vdev))
continue;
@@ -1888,6 +1881,10 @@ static void uvc_unregister_video(struct uvc_device *dev)
uvc_debugfs_cleanup_stream(stream);
}
+ /* Release the reference implicit in kref_init from uvc_probe,
+ * the UVC device won't be deleted until the last file descriptor
+ * is also closed.
+ */
kref_put(&dev->ref, uvc_delete);
}
--
2.1.4
>From 32a612bc06a2a1b9215f3b7166342c98043bd925 Mon Sep 17 00:00:00 2001
From: David Fries <David(a)Fries.net>
Date: Thu, 24 May 2018 23:43:15 -0500
Subject: [PATCH] uvc_driver: UVC kref never reaches zero leading to denial of
service
Commit 9d15cd958c17 ("media: uvcvideo: Convert from using an atomic
variable to a reference count")
didn't take into account that while the counter was
initialized to 0 (no stream open), kref_init starts with a reference
of 1, leading to the device
never reaching 0 and uvc_delete never getting called. This leads to
evdev never getting released and /dev/input/event* eventually running
out of minors preventing any new event devices and new USB cameras from
being usable until the system is rebooted.
In my case "disabled by hub (EMI?), re-enabling..." kept
removing/inserting the device until days later it ran out of minors
and I lost the video security feed.
Now that the device is actually being removed other problems are
showing up. Specifically the following if the camera is removed or
`rmmod ehci_pci` while an application is getting video from it. It
doesn't happen if the camera is not in use. How do I track that down?
sysfs group 'power' not found for kobject 'event10'
sysfs group 'power' not found for kobject 'input32'
sysfs group 'id' not found for kobject 'input32'
sysfs group 'capabilities' not found for kobject 'input32'
sysfs group 'power' not found for kobject 'media0'
Signed-off-by: David Fries <David(a)Fries.net>
---
drivers/media/usb/uvc/uvc_driver.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
index 2469b49..3cbdc87 100644
--- a/drivers/media/usb/uvc/uvc_driver.c
+++ b/drivers/media/usb/uvc/uvc_driver.c
@@ -1871,13 +1871,6 @@ static void uvc_unregister_video(struct uvc_device *dev)
{
struct uvc_streaming *stream;
- /* Unregistering all video devices might result in uvc_delete() being
- * called from inside the loop if there's no open file handle. To avoid
- * that, increment the refcount before iterating over the streams and
- * decrement it when done.
- */
- kref_get(&dev->ref);
-
list_for_each_entry(stream, &dev->streams, list) {
if (!video_is_registered(&stream->vdev))
continue;
@@ -1888,6 +1881,10 @@ static void uvc_unregister_video(struct uvc_device *dev)
uvc_debugfs_cleanup_stream(stream);
}
+ /* Release the reference implicit in kref_init from uvc_probe,
+ * the UVC device won't be deleted until the last file descriptor
+ * is also closed.
+ */
kref_put(&dev->ref, uvc_delete);
}
--
2.1.4
The patch in the following e-mail fixes a reference count bug, it
seems to me that uvc_unregister_video is a good location to release
the final reference, I find it is called once. It may sound like a
lot to plug and unplug the USB camera 250 some times, but in my case
"disabled by hub (EMI?), re-enabling..." kept unplugging and plugging
in the device until days later it ran out of minors and I lost the
video security feed.
With this patch, now that the device is actually being removed other
problems are showing up. Specifically the following if the camera is
removed or `rmmod ehci_pci` while an application is getting video from
it. It doesn't happen if the camera is not in use. How do I track
that down?
sysfs group 'power' not found for kobject 'event10'
sysfs group 'power' not found for kobject 'input32'
sysfs group 'id' not found for kobject 'input32'
sysfs group 'capabilities' not found for kobject 'input32'
sysfs group 'power' not found for kobject 'media0'
--
David Fries <david(a)fries.net>
Subject of the patch: xfs: remove racy hasattr check from attr ops
Commit ID: 5a93790d4e2df73e30c965ec6e49be82fc3ccfce
Why: It didn't pass LTP getxattr04 test, which is "a regression test for the race between getting an existing xattr and setting/removing a large xattr. This bug leads to that getxattr() fails to get an existing xattr and returns ENOATTR in xfs filesystem."
LTP test getxattr04 was FAILing with this error message:
tst_device.c:230: INFO: Using test device LTP_DEV='/dev/loop0'
tst_mkfs.c:83: INFO: Formatting /dev/loop0 with xfs opts='' extra opts=''
tst_test.c:982: INFO: Timeout per run is 0h 05m 00s
getxattr04.c:72: FAIL: getxattr() failed to get an existing attribute
After patching 4.4.y and running the test again (on x86_64) it PASSes:
tst_device.c:230: INFO: Using test device LTP_DEV='/dev/loop0'
tst_mkfs.c:83: INFO: Formatting /dev/loop0 with xfs opts='' extra opts=''
tst_test.c:982: INFO: Timeout per run is 0h 05m 00s
getxattr04.c:82: PASS: getxattr() succeeded to get an existing attribute
What kernel version: 4.4.y (Note: 4.9.y already has it applied)
Thanks,
Daniel Sangorrin
Hi Greg,
Please apply commit 4ea77014af0d620 ("kernel/signal.c: avoid undefined behaviour
in kill_something_info") to v4.9.y and earlier to fix CVE-2018-10124.
Thanks,
Guenter
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 8e907ed4882714fd13cfe670681fc6cb5284c780 Mon Sep 17 00:00:00 2001
From: Lidong Chen <jemmy858585(a)gmail.com>
Date: Tue, 8 May 2018 16:50:16 +0800
Subject: [PATCH] IB/umem: Use the correct mm during ib_umem_release
User-space may invoke ibv_reg_mr and ibv_dereg_mr in different threads.
If ibv_dereg_mr is called after the thread which invoked ibv_reg_mr has
exited, get_pid_task will return NULL and ib_umem_release will not
decrease mm->pinned_vm.
Instead of using threads to locate the mm, use the overall tgid from the
ib_ucontext struct instead. This matches the behavior of ODP and
disassociate in handling the mm of the process that called ibv_reg_mr.
Cc: <stable(a)vger.kernel.org>
Fixes: 87773dd56d54 ("IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get")
Signed-off-by: Lidong Chen <lidongchen(a)tencent.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 9a4e899d94b3..2b6c9b516070 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -119,7 +119,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
umem->length = size;
umem->address = addr;
umem->page_shift = PAGE_SHIFT;
- umem->pid = get_task_pid(current, PIDTYPE_PID);
/*
* We ask for writable memory if any of the following
* access flags are set. "Local write" and "remote write"
@@ -132,7 +131,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
if (access & IB_ACCESS_ON_DEMAND) {
- put_pid(umem->pid);
ret = ib_umem_odp_get(context, umem, access);
if (ret) {
kfree(umem);
@@ -148,7 +146,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
page_list = (struct page **) __get_free_page(GFP_KERNEL);
if (!page_list) {
- put_pid(umem->pid);
kfree(umem);
return ERR_PTR(-ENOMEM);
}
@@ -231,7 +228,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
if (ret < 0) {
if (need_release)
__ib_umem_release(context->device, umem, 0);
- put_pid(umem->pid);
kfree(umem);
} else
current->mm->pinned_vm = locked;
@@ -274,8 +270,7 @@ void ib_umem_release(struct ib_umem *umem)
__ib_umem_release(umem->context->device, umem, 1);
- task = get_pid_task(umem->pid, PIDTYPE_PID);
- put_pid(umem->pid);
+ task = get_pid_task(umem->context->tgid, PIDTYPE_PID);
if (!task)
goto out;
mm = get_task_mm(task);
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index 23159dd5be18..a1fd63871d17 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -48,7 +48,6 @@ struct ib_umem {
int writable;
int hugetlb;
struct work_struct work;
- struct pid *pid;
struct mm_struct *mm;
unsigned long diff;
struct ib_umem_odp *odp_data;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 8e907ed4882714fd13cfe670681fc6cb5284c780 Mon Sep 17 00:00:00 2001
From: Lidong Chen <jemmy858585(a)gmail.com>
Date: Tue, 8 May 2018 16:50:16 +0800
Subject: [PATCH] IB/umem: Use the correct mm during ib_umem_release
User-space may invoke ibv_reg_mr and ibv_dereg_mr in different threads.
If ibv_dereg_mr is called after the thread which invoked ibv_reg_mr has
exited, get_pid_task will return NULL and ib_umem_release will not
decrease mm->pinned_vm.
Instead of using threads to locate the mm, use the overall tgid from the
ib_ucontext struct instead. This matches the behavior of ODP and
disassociate in handling the mm of the process that called ibv_reg_mr.
Cc: <stable(a)vger.kernel.org>
Fixes: 87773dd56d54 ("IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get")
Signed-off-by: Lidong Chen <lidongchen(a)tencent.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 9a4e899d94b3..2b6c9b516070 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -119,7 +119,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
umem->length = size;
umem->address = addr;
umem->page_shift = PAGE_SHIFT;
- umem->pid = get_task_pid(current, PIDTYPE_PID);
/*
* We ask for writable memory if any of the following
* access flags are set. "Local write" and "remote write"
@@ -132,7 +131,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
if (access & IB_ACCESS_ON_DEMAND) {
- put_pid(umem->pid);
ret = ib_umem_odp_get(context, umem, access);
if (ret) {
kfree(umem);
@@ -148,7 +146,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
page_list = (struct page **) __get_free_page(GFP_KERNEL);
if (!page_list) {
- put_pid(umem->pid);
kfree(umem);
return ERR_PTR(-ENOMEM);
}
@@ -231,7 +228,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
if (ret < 0) {
if (need_release)
__ib_umem_release(context->device, umem, 0);
- put_pid(umem->pid);
kfree(umem);
} else
current->mm->pinned_vm = locked;
@@ -274,8 +270,7 @@ void ib_umem_release(struct ib_umem *umem)
__ib_umem_release(umem->context->device, umem, 1);
- task = get_pid_task(umem->pid, PIDTYPE_PID);
- put_pid(umem->pid);
+ task = get_pid_task(umem->context->tgid, PIDTYPE_PID);
if (!task)
goto out;
mm = get_task_mm(task);
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index 23159dd5be18..a1fd63871d17 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -48,7 +48,6 @@ struct ib_umem {
int writable;
int hugetlb;
struct work_struct work;
- struct pid *pid;
struct mm_struct *mm;
unsigned long diff;
struct ib_umem_odp *odp_data;
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From faf37c44a105f3608115785f17cbbf3500f8bc71 Mon Sep 17 00:00:00 2001
From: Michael Neuling <mikey(a)neuling.org>
Date: Fri, 18 May 2018 11:37:42 +1000
Subject: [PATCH] powerpc/64s: Clear PCR on boot
Clear the PCR (Processor Compatibility Register) on boot to ensure we
are not running in a compatibility mode.
We've seen this cause problems when a crash (and kdump) occurs while
running compat mode guests. The kdump kernel then runs with the PCR
set and causes problems. The symptom in the kdump kernel (also seen in
petitboot after fast-reboot) is early userspace programs taking
sigills on newer instructions (seen in libc).
Signed-off-by: Michael Neuling <mikey(a)neuling.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
index 3f30c994e931..458b928dbd84 100644
--- a/arch/powerpc/kernel/cpu_setup_power.S
+++ b/arch/powerpc/kernel/cpu_setup_power.S
@@ -28,6 +28,7 @@ _GLOBAL(__setup_cpu_power7)
beqlr
li r0,0
mtspr SPRN_LPID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
li r4,(LPCR_LPES1 >> LPCR_LPES_SH)
bl __init_LPCR_ISA206
@@ -41,6 +42,7 @@ _GLOBAL(__restore_cpu_power7)
beqlr
li r0,0
mtspr SPRN_LPID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
li r4,(LPCR_LPES1 >> LPCR_LPES_SH)
bl __init_LPCR_ISA206
@@ -57,6 +59,7 @@ _GLOBAL(__setup_cpu_power8)
beqlr
li r0,0
mtspr SPRN_LPID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
ori r3, r3, LPCR_PECEDH
li r4,0 /* LPES = 0 */
@@ -78,6 +81,7 @@ _GLOBAL(__restore_cpu_power8)
beqlr
li r0,0
mtspr SPRN_LPID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
ori r3, r3, LPCR_PECEDH
li r4,0 /* LPES = 0 */
@@ -99,6 +103,7 @@ _GLOBAL(__setup_cpu_power9)
mtspr SPRN_PSSCR,r0
mtspr SPRN_LPID,r0
mtspr SPRN_PID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
LOAD_REG_IMMEDIATE(r4, LPCR_PECEDH | LPCR_PECE_HVEE | LPCR_HVICE | LPCR_HEIC)
or r3, r3, r4
@@ -123,6 +128,7 @@ _GLOBAL(__restore_cpu_power9)
mtspr SPRN_PSSCR,r0
mtspr SPRN_LPID,r0
mtspr SPRN_PID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
LOAD_REG_IMMEDIATE(r4, LPCR_PECEDH | LPCR_PECE_HVEE | LPCR_HVICE | LPCR_HEIC)
or r3, r3, r4
diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
index 8ab51f6ca03a..c904477abaf3 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -101,6 +101,7 @@ static void __restore_cpu_cpufeatures(void)
if (hv_mode) {
mtspr(SPRN_LPID, 0);
mtspr(SPRN_HFSCR, system_registers.hfscr);
+ mtspr(SPRN_PCR, 0);
}
mtspr(SPRN_FSCR, system_registers.fscr);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From faf37c44a105f3608115785f17cbbf3500f8bc71 Mon Sep 17 00:00:00 2001
From: Michael Neuling <mikey(a)neuling.org>
Date: Fri, 18 May 2018 11:37:42 +1000
Subject: [PATCH] powerpc/64s: Clear PCR on boot
Clear the PCR (Processor Compatibility Register) on boot to ensure we
are not running in a compatibility mode.
We've seen this cause problems when a crash (and kdump) occurs while
running compat mode guests. The kdump kernel then runs with the PCR
set and causes problems. The symptom in the kdump kernel (also seen in
petitboot after fast-reboot) is early userspace programs taking
sigills on newer instructions (seen in libc).
Signed-off-by: Michael Neuling <mikey(a)neuling.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
index 3f30c994e931..458b928dbd84 100644
--- a/arch/powerpc/kernel/cpu_setup_power.S
+++ b/arch/powerpc/kernel/cpu_setup_power.S
@@ -28,6 +28,7 @@ _GLOBAL(__setup_cpu_power7)
beqlr
li r0,0
mtspr SPRN_LPID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
li r4,(LPCR_LPES1 >> LPCR_LPES_SH)
bl __init_LPCR_ISA206
@@ -41,6 +42,7 @@ _GLOBAL(__restore_cpu_power7)
beqlr
li r0,0
mtspr SPRN_LPID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
li r4,(LPCR_LPES1 >> LPCR_LPES_SH)
bl __init_LPCR_ISA206
@@ -57,6 +59,7 @@ _GLOBAL(__setup_cpu_power8)
beqlr
li r0,0
mtspr SPRN_LPID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
ori r3, r3, LPCR_PECEDH
li r4,0 /* LPES = 0 */
@@ -78,6 +81,7 @@ _GLOBAL(__restore_cpu_power8)
beqlr
li r0,0
mtspr SPRN_LPID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
ori r3, r3, LPCR_PECEDH
li r4,0 /* LPES = 0 */
@@ -99,6 +103,7 @@ _GLOBAL(__setup_cpu_power9)
mtspr SPRN_PSSCR,r0
mtspr SPRN_LPID,r0
mtspr SPRN_PID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
LOAD_REG_IMMEDIATE(r4, LPCR_PECEDH | LPCR_PECE_HVEE | LPCR_HVICE | LPCR_HEIC)
or r3, r3, r4
@@ -123,6 +128,7 @@ _GLOBAL(__restore_cpu_power9)
mtspr SPRN_PSSCR,r0
mtspr SPRN_LPID,r0
mtspr SPRN_PID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
LOAD_REG_IMMEDIATE(r4, LPCR_PECEDH | LPCR_PECE_HVEE | LPCR_HVICE | LPCR_HEIC)
or r3, r3, r4
diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
index 8ab51f6ca03a..c904477abaf3 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -101,6 +101,7 @@ static void __restore_cpu_cpufeatures(void)
if (hv_mode) {
mtspr(SPRN_LPID, 0);
mtspr(SPRN_HFSCR, system_registers.hfscr);
+ mtspr(SPRN_PCR, 0);
}
mtspr(SPRN_FSCR, system_registers.fscr);
The BAM has 3 channels - tx, rx and command. command channel
is used for register read/writes, tx channel for data writes
and rx channel for data reads. Currently, the driver assumes the
transfer completion once it gets all the command descriptors
completed. Sometimes, there is race condition between data channel
(tx/rx) and command channel completion. In these cases,
the data present in buffer is not valid during small window
between command descriptor completion and data descriptor
completion.
This patch generates NAND transfer completion when both
(Data and Command) DMA channels have completed all its DMA
descriptors. It assigns completion callback in last
DMA descriptors of that channel and wait for completion.
Fixes: 8d6b6d7e135e ("mtd: nand: qcom: support for command descriptor formation")
Cc: stable(a)vger.kernel.org
Signed-off-by: Abhishek Sahu <absahu(a)codeaurora.org>
---
* Changes from v2:
1. Changed commit message and comments slightly
2. Renamed wait_second_completion from first_chan_done and set
it before submit desc
3. Mark for stable tree
* Changes from v1:
NONE
drivers/mtd/nand/raw/qcom_nandc.c | 53 ++++++++++++++++++++++++++++++++++++++-
1 file changed, 52 insertions(+), 1 deletion(-)
diff --git a/drivers/mtd/nand/raw/qcom_nandc.c b/drivers/mtd/nand/raw/qcom_nandc.c
index 7377923..7f85ef8 100644
--- a/drivers/mtd/nand/raw/qcom_nandc.c
+++ b/drivers/mtd/nand/raw/qcom_nandc.c
@@ -213,6 +213,8 @@
#define QPIC_PER_CW_CMD_SGL 32
#define QPIC_PER_CW_DATA_SGL 8
+#define QPIC_NAND_COMPLETION_TIMEOUT msecs_to_jiffies(2000)
+
/*
* Flags used in DMA descriptor preparation helper functions
* (i.e. read_reg_dma/write_reg_dma/read_data_dma/write_data_dma)
@@ -245,6 +247,11 @@
* @tx_sgl_start - start index in data sgl for tx.
* @rx_sgl_pos - current index in data sgl for rx.
* @rx_sgl_start - start index in data sgl for rx.
+ * @wait_second_completion - wait for second DMA desc completion before making
+ * the NAND transfer completion.
+ * @txn_done - completion for NAND transfer.
+ * @last_data_desc - last DMA desc in data channel (tx/rx).
+ * @last_cmd_desc - last DMA desc in command channel.
*/
struct bam_transaction {
struct bam_cmd_element *bam_ce;
@@ -258,6 +265,10 @@ struct bam_transaction {
u32 tx_sgl_start;
u32 rx_sgl_pos;
u32 rx_sgl_start;
+ bool wait_second_completion;
+ struct completion txn_done;
+ struct dma_async_tx_descriptor *last_data_desc;
+ struct dma_async_tx_descriptor *last_cmd_desc;
};
/*
@@ -504,6 +515,8 @@ static void free_bam_transaction(struct qcom_nand_controller *nandc)
bam_txn->data_sgl = bam_txn_buf;
+ init_completion(&bam_txn->txn_done);
+
return bam_txn;
}
@@ -523,11 +536,33 @@ static void clear_bam_transaction(struct qcom_nand_controller *nandc)
bam_txn->tx_sgl_start = 0;
bam_txn->rx_sgl_pos = 0;
bam_txn->rx_sgl_start = 0;
+ bam_txn->last_data_desc = NULL;
+ bam_txn->wait_second_completion = false;
sg_init_table(bam_txn->cmd_sgl, nandc->max_cwperpage *
QPIC_PER_CW_CMD_SGL);
sg_init_table(bam_txn->data_sgl, nandc->max_cwperpage *
QPIC_PER_CW_DATA_SGL);
+
+ reinit_completion(&bam_txn->txn_done);
+}
+
+/* Callback for DMA descriptor completion */
+static void qpic_bam_dma_done(void *data)
+{
+ struct bam_transaction *bam_txn = data;
+
+ /*
+ * In case of data transfer with NAND, 2 callbacks will be generated.
+ * One for command channel and another one for data channel.
+ * If current transaction has data descriptors
+ * (i.e. wait_second_completion is true), then set this to false
+ * and wait for second DMA descriptor completion.
+ */
+ if (bam_txn->wait_second_completion)
+ bam_txn->wait_second_completion = false;
+ else
+ complete(&bam_txn->txn_done);
}
static inline struct qcom_nand_host *to_qcom_nand_host(struct nand_chip *chip)
@@ -756,6 +791,12 @@ static int prepare_bam_async_desc(struct qcom_nand_controller *nandc,
desc->dma_desc = dma_desc;
+ /* update last data/command descriptor */
+ if (chan == nandc->cmd_chan)
+ bam_txn->last_cmd_desc = dma_desc;
+ else
+ bam_txn->last_data_desc = dma_desc;
+
list_add_tail(&desc->node, &nandc->desc_list);
return 0;
@@ -1273,10 +1314,20 @@ static int submit_descs(struct qcom_nand_controller *nandc)
cookie = dmaengine_submit(desc->dma_desc);
if (nandc->props->is_bam) {
+ bam_txn->last_cmd_desc->callback = qpic_bam_dma_done;
+ bam_txn->last_cmd_desc->callback_param = bam_txn;
+ if (bam_txn->last_data_desc) {
+ bam_txn->last_data_desc->callback = qpic_bam_dma_done;
+ bam_txn->last_data_desc->callback_param = bam_txn;
+ bam_txn->wait_second_completion = true;
+ }
+
dma_async_issue_pending(nandc->tx_chan);
dma_async_issue_pending(nandc->rx_chan);
+ dma_async_issue_pending(nandc->cmd_chan);
- if (dma_sync_wait(nandc->cmd_chan, cookie) != DMA_COMPLETE)
+ if (!wait_for_completion_timeout(&bam_txn->txn_done,
+ QPIC_NAND_COMPLETION_TIMEOUT))
return -ETIMEDOUT;
} else {
if (dma_sync_wait(nandc->chan, cookie) != DMA_COMPLETE)
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc.
is a member of Code Aurora Forum, hosted by The Linux Foundation
Hi Greg,
Please queue up this series of patches for 4.14 if you have no objections.
cheers
Mauricio Faria de Oliveira (4):
powerpc/rfi-flush: Differentiate enabled and patched flush types
powerpc/pseries: Fix clearing of security feature flags
powerpc: Move default security feature flags
powerpc/pseries: Restore default security feature flags on setup
Michael Ellerman (17):
powerpc/pseries: Support firmware disable of RFI flush
powerpc/powernv: Support firmware disable of RFI flush
powerpc/rfi-flush: Move the logic to avoid a redo into the debugfs
code
powerpc/rfi-flush: Make it possible to call setup_rfi_flush() again
powerpc/rfi-flush: Always enable fallback flush on pseries
powerpc/rfi-flush: Call setup_rfi_flush() after LPM migration
powerpc/pseries: Add new H_GET_CPU_CHARACTERISTICS flags
powerpc: Add security feature flags for Spectre/Meltdown
powerpc/pseries: Set or clear security feature flags
powerpc/powernv: Set or clear security feature flags
powerpc/64s: Move cpu_show_meltdown()
powerpc/64s: Enhance the information in cpu_show_meltdown()
powerpc/powernv: Use the security flags in pnv_setup_rfi_flush()
powerpc/pseries: Use the security flags in pseries_setup_rfi_flush()
powerpc/64s: Wire up cpu_show_spectre_v1()
powerpc/64s: Wire up cpu_show_spectre_v2()
powerpc/64s: Fix section mismatch warnings from setup_rfi_flush()
Nicholas Piggin (2):
powerpc/64s: Improve RFI L1-D cache flush fallback
powerpc/64s: Add support for a store forwarding barrier at kernel
entry/exit
arch/powerpc/include/asm/exception-64s.h | 29 ++++
arch/powerpc/include/asm/feature-fixups.h | 19 +++
arch/powerpc/include/asm/hvcall.h | 3 +
arch/powerpc/include/asm/paca.h | 3 +-
arch/powerpc/include/asm/security_features.h | 85 ++++++++++
arch/powerpc/include/asm/setup.h | 2 +-
arch/powerpc/kernel/Makefile | 2 +-
arch/powerpc/kernel/asm-offsets.c | 3 +-
arch/powerpc/kernel/exceptions-64s.S | 95 ++++++-----
arch/powerpc/kernel/security.c | 237 +++++++++++++++++++++++++++
arch/powerpc/kernel/setup_64.c | 48 ++----
arch/powerpc/kernel/vmlinux.lds.S | 14 ++
arch/powerpc/lib/feature-fixups.c | 124 +++++++++++++-
arch/powerpc/platforms/powernv/setup.c | 92 ++++++++---
arch/powerpc/platforms/pseries/mobility.c | 3 +
arch/powerpc/platforms/pseries/pseries.h | 2 +
arch/powerpc/platforms/pseries/setup.c | 81 +++++++--
arch/powerpc/xmon/xmon.c | 2 +
18 files changed, 721 insertions(+), 123 deletions(-)
create mode 100644 arch/powerpc/include/asm/security_features.h
create mode 100644 arch/powerpc/kernel/security.c
--
2.14.1
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: vsp1: Release buffers for each video node
Author: Kieran Bingham <kieran.bingham+renesas(a)ideasonboard.com>
Date: Fri May 18 16:41:54 2018 -0400
Commit 372b2b0399fc ("media: v4l: vsp1: Release buffers in
start_streaming error path") introduced a helper to clean up buffers on
error paths, but inadvertently changed the code such that only the
output WPF buffers were cleaned, rather than the video node being
operated on.
Since then vsp1_video_cleanup_pipeline() has grown to perform both video
node cleanup, as well as pipeline cleanup. Split the implementation into
two distinct functions that perform the required work, so that each
video node can release its buffers correctly on streamoff. The pipe
cleanup that was performed in the vsp1_video_stop_streaming() (releasing
the pipe->dl) is moved to the function for clarity.
Fixes: 372b2b0399fc ("media: v4l: vsp1: Release buffers in start_streaming error path")
Cc: stable(a)vger.kernel.org # v4.14+
Signed-off-by: Kieran Bingham <kieran.bingham+renesas(a)ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas(a)ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung(a)kernel.org>
drivers/media/platform/vsp1/vsp1_video.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
---
diff --git a/drivers/media/platform/vsp1/vsp1_video.c b/drivers/media/platform/vsp1/vsp1_video.c
index c8c12223a267..ba89dd176a13 100644
--- a/drivers/media/platform/vsp1/vsp1_video.c
+++ b/drivers/media/platform/vsp1/vsp1_video.c
@@ -842,9 +842,8 @@ static int vsp1_video_setup_pipeline(struct vsp1_pipeline *pipe)
return 0;
}
-static void vsp1_video_cleanup_pipeline(struct vsp1_pipeline *pipe)
+static void vsp1_video_release_buffers(struct vsp1_video *video)
{
- struct vsp1_video *video = pipe->output->video;
struct vsp1_vb2_buffer *buffer;
unsigned long flags;
@@ -854,12 +853,18 @@ static void vsp1_video_cleanup_pipeline(struct vsp1_pipeline *pipe)
vb2_buffer_done(&buffer->buf.vb2_buf, VB2_BUF_STATE_ERROR);
INIT_LIST_HEAD(&video->irqqueue);
spin_unlock_irqrestore(&video->irqlock, flags);
+}
+
+static void vsp1_video_cleanup_pipeline(struct vsp1_pipeline *pipe)
+{
+ lockdep_assert_held(&pipe->lock);
/* Release our partition table allocation */
- mutex_lock(&pipe->lock);
kfree(pipe->part_table);
pipe->part_table = NULL;
- mutex_unlock(&pipe->lock);
+
+ vsp1_dl_list_put(pipe->dl);
+ pipe->dl = NULL;
}
static int vsp1_video_start_streaming(struct vb2_queue *vq, unsigned int count)
@@ -874,8 +879,9 @@ static int vsp1_video_start_streaming(struct vb2_queue *vq, unsigned int count)
if (pipe->stream_count == pipe->num_inputs) {
ret = vsp1_video_setup_pipeline(pipe);
if (ret < 0) {
- mutex_unlock(&pipe->lock);
+ vsp1_video_release_buffers(video);
vsp1_video_cleanup_pipeline(pipe);
+ mutex_unlock(&pipe->lock);
return ret;
}
@@ -925,13 +931,12 @@ static void vsp1_video_stop_streaming(struct vb2_queue *vq)
if (ret == -ETIMEDOUT)
dev_err(video->vsp1->dev, "pipeline stop timeout\n");
- vsp1_dl_list_put(pipe->dl);
- pipe->dl = NULL;
+ vsp1_video_cleanup_pipeline(pipe);
}
mutex_unlock(&pipe->lock);
media_pipeline_stop(&video->video.entity);
- vsp1_video_cleanup_pipeline(pipe);
+ vsp1_video_release_buffers(video);
vsp1_video_pipeline_put(pipe);
}
From: David Hildenbrand <david(a)redhat.com>
Subject: kasan: fix memory hotplug during boot
Using module_init() is wrong. E.g. ACPI adds and onlines memory before
our memory notifier gets registered.
This makes sure that ACPI memory detected during boot up will not result
in a kernel crash.
Easily reproducible with QEMU, just specify a DIMM when starting up.
Link: http://lkml.kernel.org/r/20180522100756.18478-3-david@redhat.com
Fixes: 786a8959912e ("kasan: disable memory hotplug")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kasan/kasan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -puN mm/kasan/kasan.c~kasan-fix-memory-hotplug-during-boot mm/kasan/kasan.c
--- a/mm/kasan/kasan.c~kasan-fix-memory-hotplug-during-boot
+++ a/mm/kasan/kasan.c
@@ -898,5 +898,5 @@ static int __init kasan_memhotplug_init(
return 0;
}
-module_init(kasan_memhotplug_init);
+core_initcall(kasan_memhotplug_init);
#endif
_
From: "Gustavo A. R. Silva" <gustavo(a)embeddedor.com>
Subject: kernel/sys.c: fix potential Spectre v1 issue
`resource' can be controlled by user-space, hence leading to a potential
exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
kernel/sys.c:1474 __do_compat_sys_old_getrlimit() warn: potential
spectre issue 'get_current()->signal->rlim' (local cap)
kernel/sys.c:1455 __do_sys_old_getrlimit() warn: potential spectre issue
'get_current()->signal->rlim' (local cap)
Fix this by sanitizing *resource* before using it to index
current->signal->rlim
Notice that given that speculation windows are large, the policy is to
kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Link: http://lkml.kernel.org/r/20180515030038.GA11822@embeddedor.com
Signed-off-by: Gustavo A. R. Silva <gustavo(a)embeddedor.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Alexei Starovoitov <ast(a)kernel.org>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/sys.c | 5 +++++
1 file changed, 5 insertions(+)
diff -puN kernel/sys.c~kernel-sys-fix-potential-spectre-v1 kernel/sys.c
--- a/kernel/sys.c~kernel-sys-fix-potential-spectre-v1
+++ a/kernel/sys.c
@@ -71,6 +71,9 @@
#include <asm/io.h>
#include <asm/unistd.h>
+/* Hardening for Spectre-v1 */
+#include <linux/nospec.h>
+
#include "uid16.h"
#ifndef SET_UNALIGN_CTL
@@ -1453,6 +1456,7 @@ SYSCALL_DEFINE2(old_getrlimit, unsigned
if (resource >= RLIM_NLIMITS)
return -EINVAL;
+ resource = array_index_nospec(resource, RLIM_NLIMITS);
task_lock(current->group_leader);
x = current->signal->rlim[resource];
task_unlock(current->group_leader);
@@ -1472,6 +1476,7 @@ COMPAT_SYSCALL_DEFINE2(old_getrlimit, un
if (resource >= RLIM_NLIMITS)
return -EINVAL;
+ resource = array_index_nospec(resource, RLIM_NLIMITS);
task_lock(current->group_leader);
r = current->signal->rlim[resource];
task_unlock(current->group_leader);
_
From: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Subject: mm/kasan: don't vfree() nonexistent vm_area
KASAN uses different routines to map shadow for hot added memory and
memory obtained in boot process. Attempt to offline memory onlined by
normal boot process leads to this:
Trying to vfree() nonexistent vm area (000000005d3b34b9)
WARNING: CPU: 2 PID: 13215 at mm/vmalloc.c:1525 __vunmap+0x147/0x190
Call Trace:
kasan_mem_notifier+0xad/0xb9
notifier_call_chain+0x166/0x260
__blocking_notifier_call_chain+0xdb/0x140
__offline_pages+0x96a/0xb10
memory_subsys_offline+0x76/0xc0
device_offline+0xb8/0x120
store_mem_state+0xfa/0x120
kernfs_fop_write+0x1d5/0x320
__vfs_write+0xd4/0x530
vfs_write+0x105/0x340
SyS_write+0xb0/0x140
Obviously we can't call vfree() to free memory that wasn't allocated via
vmalloc(). Use find_vm_area() to see if we can call vfree().
Unfortunately it's a bit tricky to properly unmap and free shadow
allocated during boot, so we'll have to keep it. If memory will come
online again that shadow will be reused.
Matthew asked: how can you call vfree() on something that isn't a
vmalloc address?
vfree() is able to free any address returned by
__vmalloc_node_range(). And __vmalloc_node_range() gives you any
address you ask. It doesn't have to be an address in [VMALLOC_START,
VMALLOC_END] range.
That's also how the module_alloc()/module_memfree() works on
architectures that have designated area for modules.
[aryabinin(a)virtuozzo.com: improve comments]
Link: http://lkml.kernel.org/r/dabee6ab-3a7a-51cd-3b86-5468718e0390@virtuozzo.com
[akpm(a)linux-foundation.org: fix typos, reflow comment]
Link: http://lkml.kernel.org/r/20180201163349.8700-1-aryabinin@virtuozzo.com
Fixes: fa69b5989bb0 ("mm/kasan: add support for memory hotplug")
Signed-off-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Reported-by: Paul Menzel <pmenzel+linux-kasan-dev(a)molgen.mpg.de>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kasan/kasan.c | 63 +++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 61 insertions(+), 2 deletions(-)
diff -puN mm/kasan/kasan.c~mm-kasan-dont-vfree-nonexistent-vm_area mm/kasan/kasan.c
--- a/mm/kasan/kasan.c~mm-kasan-dont-vfree-nonexistent-vm_area
+++ a/mm/kasan/kasan.c
@@ -792,6 +792,40 @@ DEFINE_ASAN_SET_SHADOW(f5);
DEFINE_ASAN_SET_SHADOW(f8);
#ifdef CONFIG_MEMORY_HOTPLUG
+static bool shadow_mapped(unsigned long addr)
+{
+ pgd_t *pgd = pgd_offset_k(addr);
+ p4d_t *p4d;
+ pud_t *pud;
+ pmd_t *pmd;
+ pte_t *pte;
+
+ if (pgd_none(*pgd))
+ return false;
+ p4d = p4d_offset(pgd, addr);
+ if (p4d_none(*p4d))
+ return false;
+ pud = pud_offset(p4d, addr);
+ if (pud_none(*pud))
+ return false;
+
+ /*
+ * We can't use pud_large() or pud_huge(), the first one is
+ * arch-specific, the last one depends on HUGETLB_PAGE. So let's abuse
+ * pud_bad(), if pud is bad then it's bad because it's huge.
+ */
+ if (pud_bad(*pud))
+ return true;
+ pmd = pmd_offset(pud, addr);
+ if (pmd_none(*pmd))
+ return false;
+
+ if (pmd_bad(*pmd))
+ return true;
+ pte = pte_offset_kernel(pmd, addr);
+ return !pte_none(*pte);
+}
+
static int __meminit kasan_mem_notifier(struct notifier_block *nb,
unsigned long action, void *data)
{
@@ -813,6 +847,14 @@ static int __meminit kasan_mem_notifier(
case MEM_GOING_ONLINE: {
void *ret;
+ /*
+ * If shadow is mapped already than it must have been mapped
+ * during the boot. This could happen if we onlining previously
+ * offlined memory.
+ */
+ if (shadow_mapped(shadow_start))
+ return NOTIFY_OK;
+
ret = __vmalloc_node_range(shadow_size, PAGE_SIZE, shadow_start,
shadow_end, GFP_KERNEL,
PAGE_KERNEL, VM_NO_GUARD,
@@ -824,8 +866,25 @@ static int __meminit kasan_mem_notifier(
kmemleak_ignore(ret);
return NOTIFY_OK;
}
- case MEM_OFFLINE:
- vfree((void *)shadow_start);
+ case MEM_OFFLINE: {
+ struct vm_struct *vm;
+
+ /*
+ * shadow_start was either mapped during boot by kasan_init()
+ * or during memory online by __vmalloc_node_range().
+ * In the latter case we can use vfree() to free shadow.
+ * Non-NULL result of the find_vm_area() will tell us if
+ * that was the second case.
+ *
+ * Currently it's not possible to free shadow mapped
+ * during boot by kasan_init(). It's because the code
+ * to do that hasn't been written yet. So we'll just
+ * leak the memory.
+ */
+ vm = find_vm_area((void *)shadow_start);
+ if (vm)
+ vfree((void *)shadow_start);
+ }
}
return NOTIFY_OK;
_