The 'nr_pages' attribute of the 'msc' subdevices parses a comma-separated
list of window sizes, passed from userspace. However, there is a bug in
the string parsing logic wherein it doesn't exclude the comma character
from the range of characters as it consumes them. This leads to an
out-of-bounds access given a sufficiently long list. For example:
> # echo 8,8,8,8 > /sys/bus/intel_th/devices/0-msc0/nr_pages
> ==================================================================
> BUG: KASAN: slab-out-of-bounds in memchr+0x1e/0x40
> Read of size 1 at addr ffff8803ffcebcd1 by task sh/825
>
> CPU: 3 PID: 825 Comm: npktest.sh Tainted: G W 4.20.0-rc1+
> Call Trace:
> dump_stack+0x7c/0xc0
> print_address_description+0x6c/0x23c
> ? memchr+0x1e/0x40
> kasan_report.cold.5+0x241/0x308
> memchr+0x1e/0x40
> nr_pages_store+0x203/0xd00 [intel_th_msu]
Fix this by accounting for the comma character.
Signed-off-by: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Fixes: ba82664c134ef ("intel_th: Add Memory Storage Unit driver")
Cc: stable(a)vger.kernel.org # v4.4+
---
drivers/hwtracing/intel_th/msu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/intel_th/msu.c b/drivers/hwtracing/intel_th/msu.c
index d293e55553bd..ba7aaf421f36 100644
--- a/drivers/hwtracing/intel_th/msu.c
+++ b/drivers/hwtracing/intel_th/msu.c
@@ -1423,7 +1423,8 @@ nr_pages_store(struct device *dev, struct device_attribute *attr,
if (!end)
break;
- len -= end - p;
+ /* consume the number and the following comma, hence +1 */
+ len -= end - p + 1;
p = end + 1;
} while (len);
--
2.19.2
This is the start of the stable review cycle for the 4.19.11 release.
There are 44 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu Dec 20 16:39:02 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.11-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.11-rc1
Masahiro Yamada <yamada.masahiro(a)socionext.com>
x86/build: Fix compiler support check for CONFIG_RETPOLINE
Damien Le Moal <damien.lemoal(a)wdc.com>
dm zoned: Fix target BIO completion handling
Junwei Zhang <Jerry.Zhang(a)amd.com>
drm/amdgpu: update SMC firmware image for polaris10 variants
Alex Deucher <alexander.deucher(a)amd.com>
drm/amdgpu: update smu firmware images for VI variants (v2)
Alex Deucher <alexander.deucher(a)amd.com>
drm/amdgpu: add some additional vega10 pci ids
Alex Deucher <alexander.deucher(a)amd.com>
drm/amdkfd: add new vega10 pci ids
Kenneth Feng <kenneth.feng(a)amd.com>
drm/amdgpu/powerplay: Apply avfs cks-off voltages on VI
Chris Wilson <chris(a)chris-wilson.co.uk>
drm/i915/execlists: Apply a full mb before execution for Braswell
Tina Zhang <tina.zhang(a)intel.com>
drm/i915/gvt: Fix tiled memory decoding bug on BDW
Brian Norris <briannorris(a)chromium.org>
Revert "drm/rockchip: Allow driver to be shutdown on reboot/kexec"
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/kms/nv50-: also flush fb writes when rewinding push buffer
Lyude Paul <lyude(a)redhat.com>
drm/nouveau/kms: Fix memory leak in nv50_mstm_del()
Benjamin Herrenschmidt <benh(a)kernel.crashing.org>
powerpc: Look for "stdout-path" when setting up legacy consoles
Radu Rendec <radu.rendec(a)gmail.com>
powerpc/msi: Fix NULL pointer access in teardown code
Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
media: vb2: don't call __vb2_queue_cancel if vb2_start_streaming failed
Steven Rostedt (VMware) <rostedt(a)goodmis.org>
tracing: Fix memory leak of instance function hash filters
Steven Rostedt (VMware) <rostedt(a)goodmis.org>
tracing: Fix memory leak in set_trigger_filter()
Steven Rostedt (VMware) <rostedt(a)goodmis.org>
tracing: Fix memory leak in create_filter()
Mike Snitzer <snitzer(a)redhat.com>
dm: call blk_queue_split() to impose device limits on bios
Mike Snitzer <snitzer(a)redhat.com>
dm cache metadata: verify cache has blocks in blocks_are_clean_separate_dirty()
Mike Snitzer <snitzer(a)redhat.com>
dm thin: send event about thin-pool state change _after_ making it
Stefan Wahren <stefan.wahren(a)i2se.com>
ARM: dts: bcm2837: Fix polarity of wifi reset GPIOs
Lubomir Rintel <lkundrak(a)v3.sk>
ARM: mmp/mmp2: fix cpu_is_mmp2() on mmp2-dt
Chad Austin <chadaustin(a)fb.com>
fuse: continue to send FUSE_RELEASEDIR when FUSE_OPEN returns ENOSYS
Alek Du <alek.du(a)intel.com>
mmc: sdhci: fix the timeout check window for clock and reset
Faiz Abbas <faiz_abbas(a)ti.com>
mmc: sdhci-omap: Fix DCRC error handling during tuning
Wolfram Sang <wsa+renesas(a)sang-engineering.com>
mmc: core: use mrq->sbc when sending CMD23 for RPMB
Aaro Koskinen <aaro.koskinen(a)iki.fi>
MMC: OMAP: fix broken MMC on OMAP15XX/OMAP5910/OMAP310
Amir Goldstein <amir73il(a)gmail.com>
ovl: fix missing override creds in link of a metacopy upper
Amir Goldstein <amir73il(a)gmail.com>
ovl: fix decode of dir file handle with multi lower layers
Keith Busch <keith.busch(a)intel.com>
block/bio: Do not zero user pages
Robin Murphy <robin.murphy(a)arm.com>
arm64: dma-mapping: Fix FORCE_CONTIGUOUS buffer clearing
Andrea Arcangeli <aarcange(a)redhat.com>
userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered
Piotr Jaroszynski <pjaroszynski(a)nvidia.com>
fs/iomap.c: get/put the page in iomap_page_create/release()
Thierry Reding <treding(a)nvidia.com>
scripts/spdxcheck.py: always open files in binary mode
Jeff Moyer <jmoyer(a)redhat.com>
aio: fix spectre gadget in lookup_ioctx
Chen-Yu Tsai <wens(a)csie.org>
pinctrl: sunxi: a83t: Fix IRQ offset typo for PH11
Arnd Bergmann <arnd(a)arndb.de>
drm/msm: fix address space warning
Arnd Bergmann <arnd(a)arndb.de>
ARM: dts: qcom-apq8064-arrow-sd-600eval fix graph_endpoint warning
Arnd Bergmann <arnd(a)arndb.de>
i2c: aspeed: fix build warning
Arnd Bergmann <arnd(a)arndb.de>
slimbus: ngd: mark PM functions as __maybe_unused
Lubomir Rintel <lkundrak(a)v3.sk>
staging: olpc_dcon: add a missing dependency
Arnd Bergmann <arnd(a)arndb.de>
scsi: raid_attrs: fix unused variable warning
Vincent Guittot <vincent.guittot(a)linaro.org>
sched/pelt: Fix warning and clean up IRQ PELT config
-------------
Diffstat:
Makefile | 4 +-
arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts | 2 +-
arch/arm/boot/dts/bcm2837-rpi-3-b.dts | 2 +-
.../arm/boot/dts/qcom-apq8064-arrow-sd-600eval.dts | 5 +
arch/arm/mach-mmp/cputype.h | 6 +-
arch/arm64/mm/dma-mapping.c | 2 +-
arch/powerpc/kernel/legacy_serial.c | 6 +-
arch/powerpc/kernel/msi.c | 7 +-
arch/x86/Makefile | 10 +-
block/bio.c | 3 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c | 36 +++++-
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 +
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 6 +
drivers/gpu/drm/amd/powerplay/inc/smu7_ppsmc.h | 2 +
.../drm/amd/powerplay/smumgr/polaris10_smumgr.c | 6 +
drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c | 3 +
drivers/gpu/drm/i915/gvt/fb_decoder.c | 2 +-
drivers/gpu/drm/i915/intel_lrc.c | 7 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_dbg.c | 8 +-
drivers/gpu/drm/nouveau/dispnv50/disp.c | 30 +++--
drivers/gpu/drm/rockchip/rockchip_drm_drv.c | 6 -
drivers/i2c/busses/i2c-aspeed.c | 4 +-
drivers/md/dm-cache-metadata.c | 4 +
drivers/md/dm-thin.c | 68 ++++++------
drivers/md/dm-zoned-target.c | 122 +++++++--------------
drivers/md/dm.c | 2 +
drivers/media/common/videobuf2/videobuf2-core.c | 4 +-
drivers/mmc/core/block.c | 15 ++-
drivers/mmc/host/omap.c | 11 +-
drivers/mmc/host/sdhci-omap.c | 12 +-
drivers/mmc/host/sdhci.c | 18 ++-
drivers/pinctrl/sunxi/pinctrl-sun8i-a83t.c | 2 +-
drivers/scsi/raid_class.c | 4 +-
drivers/slimbus/qcom-ngd-ctrl.c | 6 +-
drivers/staging/olpc_dcon/Kconfig | 1 +
fs/aio.c | 2 +
fs/fuse/dir.c | 2 +-
fs/fuse/file.c | 21 ++--
fs/fuse/fuse_i.h | 2 +-
fs/iomap.c | 7 ++
fs/overlayfs/dir.c | 14 ++-
fs/overlayfs/export.c | 6 +-
fs/userfaultfd.c | 3 +-
init/Kconfig | 5 +
kernel/sched/core.c | 7 +-
kernel/sched/fair.c | 2 +-
kernel/sched/pelt.c | 2 +-
kernel/sched/pelt.h | 2 +-
kernel/sched/sched.h | 5 +-
kernel/trace/ftrace.c | 1 +
kernel/trace/trace_events_filter.c | 5 +-
kernel/trace/trace_events_trigger.c | 6 +-
scripts/spdxcheck.py | 6 +-
53 files changed, 311 insertions(+), 219 deletions(-)
This is a note to let you know that I've just added the patch titled
usb: r8a66597: Fix a possible concurrency use-after-free bug in
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From c85400f886e3d41e69966470879f635a2b50084c Mon Sep 17 00:00:00 2001
From: Jia-Ju Bai <baijiaju1990(a)gmail.com>
Date: Tue, 18 Dec 2018 20:04:25 +0800
Subject: usb: r8a66597: Fix a possible concurrency use-after-free bug in
r8a66597_endpoint_disable()
The function r8a66597_endpoint_disable() and r8a66597_urb_enqueue() may
be concurrently executed.
The two functions both access a possible shared variable "hep->hcpriv".
This shared variable is freed by r8a66597_endpoint_disable() via the
call path:
r8a66597_endpoint_disable
kfree(hep->hcpriv) (line 1995 in Linux-4.19)
This variable is read by r8a66597_urb_enqueue() via the call path:
r8a66597_urb_enqueue
spin_lock_irqsave(&r8a66597->lock)
init_pipe_info
enable_r8a66597_pipe
pipe = hep->hcpriv (line 802 in Linux-4.19)
The read operation is protected by a spinlock, but the free operation
is not protected by this spinlock, thus a concurrency use-after-free bug
may occur.
To fix this bug, the spin-lock and spin-unlock function calls in
r8a66597_endpoint_disable() are moved to protect the free operation.
Signed-off-by: Jia-Ju Bai <baijiaju1990(a)gmail.com>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/host/r8a66597-hcd.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/host/r8a66597-hcd.c b/drivers/usb/host/r8a66597-hcd.c
index 984892dd72f5..42668aeca57c 100644
--- a/drivers/usb/host/r8a66597-hcd.c
+++ b/drivers/usb/host/r8a66597-hcd.c
@@ -1979,6 +1979,8 @@ static int r8a66597_urb_dequeue(struct usb_hcd *hcd, struct urb *urb,
static void r8a66597_endpoint_disable(struct usb_hcd *hcd,
struct usb_host_endpoint *hep)
+__acquires(r8a66597->lock)
+__releases(r8a66597->lock)
{
struct r8a66597 *r8a66597 = hcd_to_r8a66597(hcd);
struct r8a66597_pipe *pipe = (struct r8a66597_pipe *)hep->hcpriv;
@@ -1991,13 +1993,14 @@ static void r8a66597_endpoint_disable(struct usb_hcd *hcd,
return;
pipenum = pipe->info.pipenum;
+ spin_lock_irqsave(&r8a66597->lock, flags);
if (pipenum == 0) {
kfree(hep->hcpriv);
hep->hcpriv = NULL;
+ spin_unlock_irqrestore(&r8a66597->lock, flags);
return;
}
- spin_lock_irqsave(&r8a66597->lock, flags);
pipe_stop(r8a66597, pipe);
pipe_irq_disable(r8a66597, pipenum);
disable_irq_empty(r8a66597, pipenum);
--
2.20.1
This is a note to let you know that I've just added the patch titled
driver core: Add missing dev->bus->need_parent_lock checks
to my driver-core git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git
in the driver-core-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From e121a833745b4708b660e3fe6776129c2956b041 Mon Sep 17 00:00:00 2001
From: "Rafael J. Wysocki" <rafael.j.wysocki(a)intel.com>
Date: Thu, 13 Dec 2018 19:27:47 +0100
Subject: driver core: Add missing dev->bus->need_parent_lock checks
__device_release_driver() has to check dev->bus->need_parent_lock
before dropping the parent lock and acquiring it again as it may
attempt to drop a lock that hasn't been acquired or lock a device
that shouldn't be locked and create a lock imbalance.
Fixes: 8c97a46af04b (driver core: hold dev's parent lock when needed)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Cc: stable <stable(a)vger.kernel.org>
Reviewed-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/base/dd.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 88713f182086..8ac10af17c00 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -933,11 +933,11 @@ static void __device_release_driver(struct device *dev, struct device *parent)
if (drv) {
while (device_links_busy(dev)) {
device_unlock(dev);
- if (parent)
+ if (parent && dev->bus->need_parent_lock)
device_unlock(parent);
device_links_unbind_consumers(dev);
- if (parent)
+ if (parent && dev->bus->need_parent_lock)
device_lock(parent);
device_lock(dev);
--
2.20.1
Hi Marc,
This is wrong: commit 6022fcc0e87a0eb5e9a72b15ed70dd29ebcb7343
The above is not my original patch and it should not be tagged for stable,
as it introduces the same kind of bug I intended to fix:
array_index_nospec() can now return kvm->arch.vgic.nr_spis + VGIC_NR_PRIVATE_IRQS
and this is not what you want. So, in this case the following line of code
is just fine as it is:
intid = array_index_nospec(intid, kvm->arch.vgic.nr_spis + VGIC_NR_PRIVATE_IRQS);
As the commit log says, my patch fixes:
commit 41b87599c74300027f305d7b34368ec558978ff2
not both:
commit 41b87599c74300027f305d7b34368ec558978ff2
and
commit bea2ef803ade3359026d5d357348842bca9edcf1
If you want to apply the fix on top of bea2ef803ade3359026d5d357348842bca9edcf1
then you should apply this instead:
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index bb1a83345741..e607547c7bb0 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -103,7 +103,7 @@ struct vgic_irq *vgic_get_irq(struct kvm *kvm, struct kvm_vcpu *vcpu,
{
/* SGIs and PPIs */
if (intid <= VGIC_MAX_PRIVATE) {
- intid = array_index_nospec(intid, VGIC_MAX_PRIVATE);
+ intid = array_index_nospec(intid, VGIC_MAX_PRIVATE + 1);
return &vcpu->arch.vgic_cpu.private_irqs[intid];
}
The commit log should remain the same.
Thanks
--
Gustavo
As part of my work for the Civil Infrastructure Platform, I've been
tracking security issues in the kernel and trying to ensure that the
fixes are applied to stable branches as necessary.
The "kernel-sec" repository at
<https://gitlab.com/cip-project/cip-kernel/cip-kernel-sec> contains
information about known issues and scripts to aid in maintaining and
viewing that information. Issues are identified by CVE ID and their
status is recorded for mainline and all live stable branches.
I import most of the information from distribution security trackers,
and from upstream commit references in stable branch commit messages.
Manual editing is needed mostly to correct errors in these sources, or
where the commits fixing an issue in a stable branch don't correspond
exactly to the commits fixing it in mainline.
I recently added a local web application that allows browsing the
status of all branches and issues, complete with links to references
and related commits. There is also a simple reporting script that
lists open issues for each branch.
If you're interested in security support for stable branches, please
take a look at this.
I would welcome merge requests to add to the issue data or to improve
the scripts.
Ben.
--
Ben Hutchings, Software Developer Codethink Ltd
https://www.codethink.co.uk/ Dale House, 35 Dale Street
Manchester, M1 2HF, United Kingdom
This is a note to let you know that I've just added the patch titled
binder: fix use-after-free due to ksys_close() during fdget()
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 80cd795630d6526ba729a089a435bf74a57af927 Mon Sep 17 00:00:00 2001
From: Todd Kjos <tkjos(a)android.com>
Date: Fri, 14 Dec 2018 15:58:21 -0800
Subject: binder: fix use-after-free due to ksys_close() during fdget()
44d8047f1d8 ("binder: use standard functions to allocate fds")
exposed a pre-existing issue in the binder driver.
fdget() is used in ksys_ioctl() as a performance optimization.
One of the rules associated with fdget() is that ksys_close() must
not be called between the fdget() and the fdput(). There is a case
where this requirement is not met in the binder driver which results
in the reference count dropping to 0 when the device is still in
use. This can result in use-after-free or other issues.
If userpace has passed a file-descriptor for the binder driver using
a BINDER_TYPE_FDA object, then kys_close() is called on it when
handling a binder_ioctl(BC_FREE_BUFFER) command. This violates
the assumptions for using fdget().
The problem is fixed by deferring the close using task_work_add(). A
new variant of __close_fd() was created that returns a struct file
with a reference. The fput() is deferred instead of using ksys_close().
Fixes: 44d8047f1d87a ("binder: use standard functions to allocate fds")
Suggested-by: Al Viro <viro(a)zeniv.linux.org.uk>
Signed-off-by: Todd Kjos <tkjos(a)google.com>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/android/binder.c | 63 ++++++++++++++++++++++++++++++++++++++--
fs/file.c | 29 ++++++++++++++++++
include/linux/fdtable.h | 1 +
3 files changed, 91 insertions(+), 2 deletions(-)
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index d653e8a474fc..210940bd0457 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -72,6 +72,7 @@
#include <linux/spinlock.h>
#include <linux/ratelimit.h>
#include <linux/syscalls.h>
+#include <linux/task_work.h>
#include <uapi/linux/android/binder.h>
@@ -2170,6 +2171,64 @@ static bool binder_validate_fixup(struct binder_buffer *b,
return (fixup_offset >= last_min_offset);
}
+/**
+ * struct binder_task_work_cb - for deferred close
+ *
+ * @twork: callback_head for task work
+ * @fd: fd to close
+ *
+ * Structure to pass task work to be handled after
+ * returning from binder_ioctl() via task_work_add().
+ */
+struct binder_task_work_cb {
+ struct callback_head twork;
+ struct file *file;
+};
+
+/**
+ * binder_do_fd_close() - close list of file descriptors
+ * @twork: callback head for task work
+ *
+ * It is not safe to call ksys_close() during the binder_ioctl()
+ * function if there is a chance that binder's own file descriptor
+ * might be closed. This is to meet the requirements for using
+ * fdget() (see comments for __fget_light()). Therefore use
+ * task_work_add() to schedule the close operation once we have
+ * returned from binder_ioctl(). This function is a callback
+ * for that mechanism and does the actual ksys_close() on the
+ * given file descriptor.
+ */
+static void binder_do_fd_close(struct callback_head *twork)
+{
+ struct binder_task_work_cb *twcb = container_of(twork,
+ struct binder_task_work_cb, twork);
+
+ fput(twcb->file);
+ kfree(twcb);
+}
+
+/**
+ * binder_deferred_fd_close() - schedule a close for the given file-descriptor
+ * @fd: file-descriptor to close
+ *
+ * See comments in binder_do_fd_close(). This function is used to schedule
+ * a file-descriptor to be closed after returning from binder_ioctl().
+ */
+static void binder_deferred_fd_close(int fd)
+{
+ struct binder_task_work_cb *twcb;
+
+ twcb = kzalloc(sizeof(*twcb), GFP_KERNEL);
+ if (!twcb)
+ return;
+ init_task_work(&twcb->twork, binder_do_fd_close);
+ __close_fd_get_file(fd, &twcb->file);
+ if (twcb->file)
+ task_work_add(current, &twcb->twork, true);
+ else
+ kfree(twcb);
+}
+
static void binder_transaction_buffer_release(struct binder_proc *proc,
struct binder_buffer *buffer,
binder_size_t *failed_at)
@@ -2309,7 +2368,7 @@ static void binder_transaction_buffer_release(struct binder_proc *proc,
}
fd_array = (u32 *)(parent_buffer + (uintptr_t)fda->parent_offset);
for (fd_index = 0; fd_index < fda->num_fds; fd_index++)
- ksys_close(fd_array[fd_index]);
+ binder_deferred_fd_close(fd_array[fd_index]);
} break;
default:
pr_err("transaction release %d bad object type %x\n",
@@ -3928,7 +3987,7 @@ static int binder_apply_fd_fixups(struct binder_transaction *t)
} else if (ret) {
u32 *fdp = (u32 *)(t->buffer->data + fixup->offset);
- ksys_close(*fdp);
+ binder_deferred_fd_close(*fdp);
}
list_del(&fixup->fixup_entry);
kfree(fixup);
diff --git a/fs/file.c b/fs/file.c
index 7ffd6e9d103d..8d059d8973e9 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -640,6 +640,35 @@ int __close_fd(struct files_struct *files, unsigned fd)
}
EXPORT_SYMBOL(__close_fd); /* for ksys_close() */
+/*
+ * variant of __close_fd that gets a ref on the file for later fput
+ */
+int __close_fd_get_file(unsigned int fd, struct file **res)
+{
+ struct files_struct *files = current->files;
+ struct file *file;
+ struct fdtable *fdt;
+
+ spin_lock(&files->file_lock);
+ fdt = files_fdtable(files);
+ if (fd >= fdt->max_fds)
+ goto out_unlock;
+ file = fdt->fd[fd];
+ if (!file)
+ goto out_unlock;
+ rcu_assign_pointer(fdt->fd[fd], NULL);
+ __put_unused_fd(files, fd);
+ spin_unlock(&files->file_lock);
+ get_file(file);
+ *res = file;
+ return filp_close(file, files);
+
+out_unlock:
+ spin_unlock(&files->file_lock);
+ *res = NULL;
+ return -ENOENT;
+}
+
void do_close_on_exec(struct files_struct *files)
{
unsigned i;
diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index 41615f38bcff..f07c55ea0c22 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -121,6 +121,7 @@ extern void __fd_install(struct files_struct *files,
unsigned int fd, struct file *file);
extern int __close_fd(struct files_struct *files,
unsigned int fd);
+extern int __close_fd_get_file(unsigned int fd, struct file **res);
extern struct kmem_cache *files_cachep;
--
2.20.1
On Tue, 18 Dec 2018 at 21:41, Sasha Levin <sashal(a)kernel.org> wrote:
>
> Hi,
>
> [This is an automated email]
>
> This commit has been processed because it contains a -stable tag.
> The stable tag indicates that it's relevant for the following trees: all
>
> The bot has tested the following trees: v4.19.10, v4.14.89, v4.9.146, v4.4.168, v3.18.130,
>
Please disregard this patch for -stable until we decide how we are
going to fix the 32-bit array packing issue.
> v4.19.10: Build OK!
> v4.14.89: Build OK!
> v4.9.146: Failed to apply! Possible dependencies:
> 2f74f09bce4f ("efi: parse ARM processor error")
> 5b53696a30d5 ("ACPI / APEI: Switch to use new generic UUID API")
> bbcc2e7b642e ("ras: acpi/apei: cper: add support for generic data v3 structure")
> c0020756315e ("efi: switch to use new generic UUID API")
>
> v4.4.168: Failed to apply! Possible dependencies:
> 2c23b73c2d02 ("x86/efi: Prepare GOP handling code for reuse as generic code")
> 2f74f09bce4f ("efi: parse ARM processor error")
> 5b53696a30d5 ("ACPI / APEI: Switch to use new generic UUID API")
> ba7e34b1bbd2 ("include/linux/efi.h: redefine type, constant, macro from generic code")
> bbcc2e7b642e ("ras: acpi/apei: cper: add support for generic data v3 structure")
> c0020756315e ("efi: switch to use new generic UUID API")
>
> v3.18.130: Failed to apply! Possible dependencies:
> 1bd0abb0c924 ("arm64/efi: set EFI_ALLOC_ALIGN to 64 KB")
> 23a0d4e8fa6d ("efi: Disable interrupts around EFI calls, not in the epilog/prolog calls")
> 2c23b73c2d02 ("x86/efi: Prepare GOP handling code for reuse as generic code")
> 2f74f09bce4f ("efi: parse ARM processor error")
> 4c62360d7562 ("efi: Handle memory error structures produced based on old versions of standard")
> 4ee20980812b ("arm64: fix data type for physical address")
> 5b53696a30d5 ("ACPI / APEI: Switch to use new generic UUID API")
> 60305db98845 ("arm64/efi: move virtmap init to early initcall")
> 744937b0b12a ("efi: Clean up the efi_call_phys_[prolog|epilog]() save/restore interaction")
> 790a2ee24278 ("Merge tag 'efi-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into core/efi")
> 8a53554e12e9 ("x86/efi: Fix multiple GOP device support")
> 8ce837cee8f5 ("arm64/mm: add create_pgd_mapping() to create private page tables")
> 9679be103108 ("arm64/efi: remove idmap manipulations from UEFI code")
> a352ea3e197b ("arm64/efi: set PE/COFF file alignment to 512 bytes")
> b05b9f5f9dcf ("x86, mirror: x86 enabling - find mirrored memory ranges")
> ba7e34b1bbd2 ("include/linux/efi.h: redefine type, constant, macro from generic code")
> bbcc2e7b642e ("ras: acpi/apei: cper: add support for generic data v3 structure")
> c0020756315e ("efi: switch to use new generic UUID API")
> d1ae8c005792 ("arm64: dmi: Add SMBIOS/DMI support")
> da141706aea5 ("arm64: add better page protections to arm64")
> e1e1fddae74b ("arm64/mm: add explicit struct_mm argument to __create_mapping()")
> ea6bc80d1819 ("arm64/efi: set PE/COFF section alignment to 4 KB")
> f3cdfd239da5 ("arm64/efi: move SetVirtualAddressMap() to UEFI stub")
>
>
> How should we proceed with this patch?
>
> --
> Thanks,
> Sasha
This is a note to let you know that I've just added the patch titled
driver core: Add missing dev->bus->need_parent_lock checks
to my driver-core git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git
in the driver-core-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the driver-core-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From e121a833745b4708b660e3fe6776129c2956b041 Mon Sep 17 00:00:00 2001
From: "Rafael J. Wysocki" <rafael.j.wysocki(a)intel.com>
Date: Thu, 13 Dec 2018 19:27:47 +0100
Subject: driver core: Add missing dev->bus->need_parent_lock checks
__device_release_driver() has to check dev->bus->need_parent_lock
before dropping the parent lock and acquiring it again as it may
attempt to drop a lock that hasn't been acquired or lock a device
that shouldn't be locked and create a lock imbalance.
Fixes: 8c97a46af04b (driver core: hold dev's parent lock when needed)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Cc: stable <stable(a)vger.kernel.org>
Reviewed-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/base/dd.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 88713f182086..8ac10af17c00 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -933,11 +933,11 @@ static void __device_release_driver(struct device *dev, struct device *parent)
if (drv) {
while (device_links_busy(dev)) {
device_unlock(dev);
- if (parent)
+ if (parent && dev->bus->need_parent_lock)
device_unlock(parent);
device_links_unbind_consumers(dev);
- if (parent)
+ if (parent && dev->bus->need_parent_lock)
device_lock(parent);
device_lock(dev);
--
2.20.1
Hi Sasha,
> -----Original Message-----
> From: Sasha Levin [mailto:sashal@kernel.org]
> Sent: Wednesday, December 19, 2018 4:25 AM
> To: Sasha Levin <sashal(a)kernel.org>; Daniel Lezcano <daniel.lezcano(a)linaro.org>; Alexey Brodkin <alexey.brodkin(a)synopsys.com>;
> tglx(a)linutronix.de
> Cc: linux-kernel(a)vger.kernel.org; Daniel Lezcano <daniel.lezcano(a)linaro.org>; Vineet Gupta <vineet.gupta1(a)synopsys.com>;
> Thomas Gleixner <tglx(a)linutronix.de>; stable(a)vger.kernel.org; stable(a)vger.kernel.org
> Subject: Re: [PATCH 12/25] clocksource/drivers/arc_timer: Utilize generic sched_clock
>
> Hi,
>
> [This is an automated email]
>
> This commit has been processed because it contains a -stable tag.
> The stable tag indicates that it's relevant for the following trees: all
>
> The bot has tested the following trees: v4.19.10, v4.14.89, v4.9.146, v4.4.168, v3.18.130,
>
> v4.19.10: Build OK!
> v4.14.89: Failed to apply! Possible dependencies:
> Unable to calculate
Here we just need a bit updated hunk due to missing [1] which was only introduced in v4.15:
-------------------------->8------------------------
--- a/drivers/clocksource/Kconfig
+++ b/drivers/clocksource/Kconfig
@@ -299,6 +299,7 @@ config CLKSRC_MPS2
config ARC_TIMERS
bool "Support for 32-bit TIMERn counters in ARC Cores" if COMPILE_TEST
depends on GENERIC_CLOCKEVENTS
+ depends on GENERIC_SCHED_CLOCK
select TIMER_OF
help
These are legacy 32-bit TIMER0 and TIMER1 counters found on all ARC cores
-------------------------->8------------------------
> v4.9.146: Failed to apply! Possible dependencies:
> v4.4.168: Failed to apply! Possible dependencies:
> v3.18.130: Failed to apply! Possible dependencies:
Everything below v4.10 we'll need to drop as ARC timers were only imported in v4.10, see [2].
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
-Alexey
This is a note to let you know that I've just added the patch titled
binder: fix use-after-free due to ksys_close() during fdget()
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the char-misc-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 80cd795630d6526ba729a089a435bf74a57af927 Mon Sep 17 00:00:00 2001
From: Todd Kjos <tkjos(a)android.com>
Date: Fri, 14 Dec 2018 15:58:21 -0800
Subject: binder: fix use-after-free due to ksys_close() during fdget()
44d8047f1d8 ("binder: use standard functions to allocate fds")
exposed a pre-existing issue in the binder driver.
fdget() is used in ksys_ioctl() as a performance optimization.
One of the rules associated with fdget() is that ksys_close() must
not be called between the fdget() and the fdput(). There is a case
where this requirement is not met in the binder driver which results
in the reference count dropping to 0 when the device is still in
use. This can result in use-after-free or other issues.
If userpace has passed a file-descriptor for the binder driver using
a BINDER_TYPE_FDA object, then kys_close() is called on it when
handling a binder_ioctl(BC_FREE_BUFFER) command. This violates
the assumptions for using fdget().
The problem is fixed by deferring the close using task_work_add(). A
new variant of __close_fd() was created that returns a struct file
with a reference. The fput() is deferred instead of using ksys_close().
Fixes: 44d8047f1d87a ("binder: use standard functions to allocate fds")
Suggested-by: Al Viro <viro(a)zeniv.linux.org.uk>
Signed-off-by: Todd Kjos <tkjos(a)google.com>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/android/binder.c | 63 ++++++++++++++++++++++++++++++++++++++--
fs/file.c | 29 ++++++++++++++++++
include/linux/fdtable.h | 1 +
3 files changed, 91 insertions(+), 2 deletions(-)
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index d653e8a474fc..210940bd0457 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -72,6 +72,7 @@
#include <linux/spinlock.h>
#include <linux/ratelimit.h>
#include <linux/syscalls.h>
+#include <linux/task_work.h>
#include <uapi/linux/android/binder.h>
@@ -2170,6 +2171,64 @@ static bool binder_validate_fixup(struct binder_buffer *b,
return (fixup_offset >= last_min_offset);
}
+/**
+ * struct binder_task_work_cb - for deferred close
+ *
+ * @twork: callback_head for task work
+ * @fd: fd to close
+ *
+ * Structure to pass task work to be handled after
+ * returning from binder_ioctl() via task_work_add().
+ */
+struct binder_task_work_cb {
+ struct callback_head twork;
+ struct file *file;
+};
+
+/**
+ * binder_do_fd_close() - close list of file descriptors
+ * @twork: callback head for task work
+ *
+ * It is not safe to call ksys_close() during the binder_ioctl()
+ * function if there is a chance that binder's own file descriptor
+ * might be closed. This is to meet the requirements for using
+ * fdget() (see comments for __fget_light()). Therefore use
+ * task_work_add() to schedule the close operation once we have
+ * returned from binder_ioctl(). This function is a callback
+ * for that mechanism and does the actual ksys_close() on the
+ * given file descriptor.
+ */
+static void binder_do_fd_close(struct callback_head *twork)
+{
+ struct binder_task_work_cb *twcb = container_of(twork,
+ struct binder_task_work_cb, twork);
+
+ fput(twcb->file);
+ kfree(twcb);
+}
+
+/**
+ * binder_deferred_fd_close() - schedule a close for the given file-descriptor
+ * @fd: file-descriptor to close
+ *
+ * See comments in binder_do_fd_close(). This function is used to schedule
+ * a file-descriptor to be closed after returning from binder_ioctl().
+ */
+static void binder_deferred_fd_close(int fd)
+{
+ struct binder_task_work_cb *twcb;
+
+ twcb = kzalloc(sizeof(*twcb), GFP_KERNEL);
+ if (!twcb)
+ return;
+ init_task_work(&twcb->twork, binder_do_fd_close);
+ __close_fd_get_file(fd, &twcb->file);
+ if (twcb->file)
+ task_work_add(current, &twcb->twork, true);
+ else
+ kfree(twcb);
+}
+
static void binder_transaction_buffer_release(struct binder_proc *proc,
struct binder_buffer *buffer,
binder_size_t *failed_at)
@@ -2309,7 +2368,7 @@ static void binder_transaction_buffer_release(struct binder_proc *proc,
}
fd_array = (u32 *)(parent_buffer + (uintptr_t)fda->parent_offset);
for (fd_index = 0; fd_index < fda->num_fds; fd_index++)
- ksys_close(fd_array[fd_index]);
+ binder_deferred_fd_close(fd_array[fd_index]);
} break;
default:
pr_err("transaction release %d bad object type %x\n",
@@ -3928,7 +3987,7 @@ static int binder_apply_fd_fixups(struct binder_transaction *t)
} else if (ret) {
u32 *fdp = (u32 *)(t->buffer->data + fixup->offset);
- ksys_close(*fdp);
+ binder_deferred_fd_close(*fdp);
}
list_del(&fixup->fixup_entry);
kfree(fixup);
diff --git a/fs/file.c b/fs/file.c
index 7ffd6e9d103d..8d059d8973e9 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -640,6 +640,35 @@ int __close_fd(struct files_struct *files, unsigned fd)
}
EXPORT_SYMBOL(__close_fd); /* for ksys_close() */
+/*
+ * variant of __close_fd that gets a ref on the file for later fput
+ */
+int __close_fd_get_file(unsigned int fd, struct file **res)
+{
+ struct files_struct *files = current->files;
+ struct file *file;
+ struct fdtable *fdt;
+
+ spin_lock(&files->file_lock);
+ fdt = files_fdtable(files);
+ if (fd >= fdt->max_fds)
+ goto out_unlock;
+ file = fdt->fd[fd];
+ if (!file)
+ goto out_unlock;
+ rcu_assign_pointer(fdt->fd[fd], NULL);
+ __put_unused_fd(files, fd);
+ spin_unlock(&files->file_lock);
+ get_file(file);
+ *res = file;
+ return filp_close(file, files);
+
+out_unlock:
+ spin_unlock(&files->file_lock);
+ *res = NULL;
+ return -ENOENT;
+}
+
void do_close_on_exec(struct files_struct *files)
{
unsigned i;
diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index 41615f38bcff..f07c55ea0c22 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -121,6 +121,7 @@ extern void __fd_install(struct files_struct *files,
unsigned int fd, struct file *file);
extern int __close_fd(struct files_struct *files,
unsigned int fd);
+extern int __close_fd_get_file(unsigned int fd, struct file **res);
extern struct kmem_cache *files_cachep;
--
2.20.1
In commit bc73905abf770192 ("[SCSI] lpfc 8.3.16: SLI Additions, updates,
and code cleanup"), lpfc_memcpy_to_slim() have switched memcpy_toio() to
__write32_copy() in order to prevent unaligned 64 bit copy. Recently, we
found that lpfc_memcpy_from_slim() have similar issues, so let it switch
memcpy_fromio() to __read32_copy().
As maintainer says, it seems that we can hardly see a real "unaligned 64
bit copy", but this patch is still useful. Because in our tests we found
that lpfc doesn't support 128 bit access, but some optimized memcpy()
use 128 bit access (at lease on Loongson).
Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
---
V2: Update commit message.
drivers/scsi/lpfc/lpfc_compat.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_compat.h b/drivers/scsi/lpfc/lpfc_compat.h
index 43cf46a..0cd1e3c 100644
--- a/drivers/scsi/lpfc/lpfc_compat.h
+++ b/drivers/scsi/lpfc/lpfc_compat.h
@@ -91,8 +91,8 @@ lpfc_memcpy_to_slim( void __iomem *dest, void *src, unsigned int bytes)
static inline void
lpfc_memcpy_from_slim( void *dest, void __iomem *src, unsigned int bytes)
{
- /* actually returns 1 byte past dest */
- memcpy_fromio( dest, src, bytes);
+ /* convert bytes in argument list to word count for copy function */
+ __ioread32_copy(dest, src, bytes / sizeof(uint32_t));
}
#endif /* __BIG_ENDIAN */
--
2.7.0
This is a note to let you know that I've just added the patch titled
serial: uartps: Fix interrupt mask issue to handle the RX interrupts
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the tty-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 260683137ab5276113fc322fdbbc578024185fee Mon Sep 17 00:00:00 2001
From: Nava kishore Manne <nava.manne(a)xilinx.com>
Date: Tue, 18 Dec 2018 13:18:42 +0100
Subject: serial: uartps: Fix interrupt mask issue to handle the RX interrupts
properly
This patch Correct the RX interrupt mask value to handle the
RX interrupts properly.
Fixes: c8dbdc842d30 ("serial: xuartps: Rewrite the interrupt handling logic")
Signed-off-by: Nava kishore Manne <nava.manne(a)xilinx.com>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Michal Simek <michal.simek(a)xilinx.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/xilinx_uartps.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/xilinx_uartps.c b/drivers/tty/serial/xilinx_uartps.c
index c6d38617d622..094f2958cb2b 100644
--- a/drivers/tty/serial/xilinx_uartps.c
+++ b/drivers/tty/serial/xilinx_uartps.c
@@ -123,7 +123,7 @@ MODULE_PARM_DESC(rx_timeout, "Rx timeout, 1-255");
#define CDNS_UART_IXR_RXTRIG 0x00000001 /* RX FIFO trigger interrupt */
#define CDNS_UART_IXR_RXFULL 0x00000004 /* RX FIFO full interrupt. */
#define CDNS_UART_IXR_RXEMPTY 0x00000002 /* RX FIFO empty interrupt. */
-#define CDNS_UART_IXR_MASK 0x00001FFF /* Valid bit mask */
+#define CDNS_UART_IXR_RXMASK 0x000021e7 /* Valid RX bit mask */
/*
* Do not enable parity error interrupt for the following
@@ -364,7 +364,7 @@ static irqreturn_t cdns_uart_isr(int irq, void *dev_id)
cdns_uart_handle_tx(dev_id);
isrstatus &= ~CDNS_UART_IXR_TXEMPTY;
}
- if (isrstatus & CDNS_UART_IXR_MASK)
+ if (isrstatus & CDNS_UART_IXR_RXMASK)
cdns_uart_handle_rx(dev_id, isrstatus);
spin_unlock(&port->lock);
--
2.20.1
This is a note to let you know that I've just added the patch titled
usb: r8a66597: Fix a possible concurrency use-after-free bug in
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the usb-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From c85400f886e3d41e69966470879f635a2b50084c Mon Sep 17 00:00:00 2001
From: Jia-Ju Bai <baijiaju1990(a)gmail.com>
Date: Tue, 18 Dec 2018 20:04:25 +0800
Subject: usb: r8a66597: Fix a possible concurrency use-after-free bug in
r8a66597_endpoint_disable()
The function r8a66597_endpoint_disable() and r8a66597_urb_enqueue() may
be concurrently executed.
The two functions both access a possible shared variable "hep->hcpriv".
This shared variable is freed by r8a66597_endpoint_disable() via the
call path:
r8a66597_endpoint_disable
kfree(hep->hcpriv) (line 1995 in Linux-4.19)
This variable is read by r8a66597_urb_enqueue() via the call path:
r8a66597_urb_enqueue
spin_lock_irqsave(&r8a66597->lock)
init_pipe_info
enable_r8a66597_pipe
pipe = hep->hcpriv (line 802 in Linux-4.19)
The read operation is protected by a spinlock, but the free operation
is not protected by this spinlock, thus a concurrency use-after-free bug
may occur.
To fix this bug, the spin-lock and spin-unlock function calls in
r8a66597_endpoint_disable() are moved to protect the free operation.
Signed-off-by: Jia-Ju Bai <baijiaju1990(a)gmail.com>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/host/r8a66597-hcd.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/host/r8a66597-hcd.c b/drivers/usb/host/r8a66597-hcd.c
index 984892dd72f5..42668aeca57c 100644
--- a/drivers/usb/host/r8a66597-hcd.c
+++ b/drivers/usb/host/r8a66597-hcd.c
@@ -1979,6 +1979,8 @@ static int r8a66597_urb_dequeue(struct usb_hcd *hcd, struct urb *urb,
static void r8a66597_endpoint_disable(struct usb_hcd *hcd,
struct usb_host_endpoint *hep)
+__acquires(r8a66597->lock)
+__releases(r8a66597->lock)
{
struct r8a66597 *r8a66597 = hcd_to_r8a66597(hcd);
struct r8a66597_pipe *pipe = (struct r8a66597_pipe *)hep->hcpriv;
@@ -1991,13 +1993,14 @@ static void r8a66597_endpoint_disable(struct usb_hcd *hcd,
return;
pipenum = pipe->info.pipenum;
+ spin_lock_irqsave(&r8a66597->lock, flags);
if (pipenum == 0) {
kfree(hep->hcpriv);
hep->hcpriv = NULL;
+ spin_unlock_irqrestore(&r8a66597->lock, flags);
return;
}
- spin_lock_irqsave(&r8a66597->lock, flags);
pipe_stop(r8a66597, pipe);
pipe_irq_disable(r8a66597, pipenum);
disable_irq_empty(r8a66597, pipenum);
--
2.20.1
On Mon, Oct 15, 2018 at 06:54:31AM -0700, Omer Tripp wrote:
> Hi Greg and all,
>
> Here is my analysis of the complete gadget, and looking forward to your
> corrections/feedback if there are any inaccuracies:
>
>
> 1.
>
> __close_fd() is reachable via the close() syscall with a user-controlled
> fd.
> 2.
>
> If said bounds check is mispredicted, then a user-controlled address
> fdt->fd[fd] is obtained then dereferenced, and the value of a
> user-controlled address is loaded into the local variable file.
> 3.
>
> file is then passed as an argument to filp_close, where the cache
> lines secret
> + offsetof(f_op) and secret + offsetof(f_mode) are hot and vulnerable to
> a timing channel attack.
>
>
> The mitigation proposed by Greg Hackmann blocks this gadget.
What ever happened to this patch? Did it get reposted? If not, can
someone please do so with this text in the changelog?
thanks,
greg k-h
At least old Xen net backends seem to send frags with no real data
sometimes. In case such a fragment happens to occur with the frag limit
already reached the frontend will BUG currently even if this situation
is easily recoverable.
Modify the BUG_ON() condition accordingly.
Cc: stable(a)vger.kernel.org
Tested-by: Dietmar Hahn <dietmar.hahn(a)ts.fujitsu.com>
Signed-off-by: Juergen Gross <jgross(a)suse.com>
---
drivers/net/xen-netfront.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index f17f602e6171..5b97cc946d70 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -905,7 +905,7 @@ static RING_IDX xennet_fill_frags(struct netfront_queue *queue,
if (skb_shinfo(skb)->nr_frags == MAX_SKB_FRAGS) {
unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
- BUG_ON(pull_to <= skb_headlen(skb));
+ BUG_ON(pull_to < skb_headlen(skb));
__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
}
if (unlikely(skb_shinfo(skb)->nr_frags >= MAX_SKB_FRAGS)) {
--
2.16.4
commit d57f9da890696af1484f4a47f7f123560197865a upstream.
struct bioctx includes the ref refcount_t to track the number of I/O
fragments used to process a target BIO as well as ensure that the zone
of the BIO is kept in the active state throughout the lifetime of the
BIO. However, since decrementing of this reference count is done in the
target .end_io method, the function bio_endio() must be called multiple
times for read and write target BIOs, which causes problems with the
value of the __bi_remaining struct bio field for chained BIOs (e.g. the
clone BIO passed by dm core is large and splits into fragments by the
block layer), resulting in incorrect values and inconsistencies with the
BIO_CHAIN flag setting. This is turn triggers the BUG_ON() call:
BUG_ON(atomic_read(&bio->__bi_remaining) <= 0);
in bio_remaining_done() called from bio_endio().
Fix this ensuring that bio_endio() is called only once for any target
BIO by always using internal clone BIOs for processing any read or
write target BIO. This allows reference counting using the target BIO
context counter to trigger the target BIO completion bio_endio() call
once all data, metadata and other zone work triggered by the BIO
complete.
Overall, this simplifies the code too as the target .end_io becomes
unnecessary and differences between read and write BIO issuing and
completion processing disappear.
Fixes: 3b1a94c88b79 ("dm zoned: drive-managed zoned block device target")
Cc: stable(a)vger.kernel.org #4.14
Signed-off-by: Damien Le Moal <damien.lemoal(a)wdc.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
---
drivers/md/dm-zoned-target.c | 122 +++++++++++------------------------
1 file changed, 38 insertions(+), 84 deletions(-)
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index ba6b0a90ecfb..532bfce7f072 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -20,7 +20,6 @@ struct dmz_bioctx {
struct dm_zone *zone;
struct bio *bio;
atomic_t ref;
- blk_status_t status;
};
/*
@@ -78,65 +77,66 @@ static inline void dmz_bio_endio(struct bio *bio, blk_status_t status)
{
struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
- if (bioctx->status == BLK_STS_OK && status != BLK_STS_OK)
- bioctx->status = status;
- bio_endio(bio);
+ if (status != BLK_STS_OK && bio->bi_status == BLK_STS_OK)
+ bio->bi_status = status;
+
+ if (atomic_dec_and_test(&bioctx->ref)) {
+ struct dm_zone *zone = bioctx->zone;
+
+ if (zone) {
+ if (bio->bi_status != BLK_STS_OK &&
+ bio_op(bio) == REQ_OP_WRITE &&
+ dmz_is_seq(zone))
+ set_bit(DMZ_SEQ_WRITE_ERR, &zone->flags);
+ dmz_deactivate_zone(zone);
+ }
+ bio_endio(bio);
+ }
}
/*
- * Partial clone read BIO completion callback. This terminates the
+ * Completion callback for an internally cloned target BIO. This terminates the
* target BIO when there are no more references to its context.
*/
-static void dmz_read_bio_end_io(struct bio *bio)
+static void dmz_clone_endio(struct bio *clone)
{
- struct dmz_bioctx *bioctx = bio->bi_private;
- blk_status_t status = bio->bi_status;
+ struct dmz_bioctx *bioctx = clone->bi_private;
+ blk_status_t status = clone->bi_status;
- bio_put(bio);
+ bio_put(clone);
dmz_bio_endio(bioctx->bio, status);
}
/*
- * Issue a BIO to a zone. The BIO may only partially process the
+ * Issue a clone of a target BIO. The clone may only partially process the
* original target BIO.
*/
-static int dmz_submit_read_bio(struct dmz_target *dmz, struct dm_zone *zone,
- struct bio *bio, sector_t chunk_block,
- unsigned int nr_blocks)
+static int dmz_submit_bio(struct dmz_target *dmz, struct dm_zone *zone,
+ struct bio *bio, sector_t chunk_block,
+ unsigned int nr_blocks)
{
struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
- sector_t sector;
struct bio *clone;
- /* BIO remap sector */
- sector = dmz_start_sect(dmz->metadata, zone) + dmz_blk2sect(chunk_block);
-
- /* If the read is not partial, there is no need to clone the BIO */
- if (nr_blocks == dmz_bio_blocks(bio)) {
- /* Setup and submit the BIO */
- bio->bi_iter.bi_sector = sector;
- atomic_inc(&bioctx->ref);
- generic_make_request(bio);
- return 0;
- }
-
- /* Partial BIO: we need to clone the BIO */
clone = bio_clone_fast(bio, GFP_NOIO, dmz->bio_set);
if (!clone)
return -ENOMEM;
- /* Setup the clone */
- clone->bi_iter.bi_sector = sector;
+ bio_set_dev(clone, dmz->dev->bdev);
+ clone->bi_iter.bi_sector =
+ dmz_start_sect(dmz->metadata, zone) + dmz_blk2sect(chunk_block);
clone->bi_iter.bi_size = dmz_blk2sect(nr_blocks) << SECTOR_SHIFT;
- clone->bi_end_io = dmz_read_bio_end_io;
+ clone->bi_end_io = dmz_clone_endio;
clone->bi_private = bioctx;
bio_advance(bio, clone->bi_iter.bi_size);
- /* Submit the clone */
atomic_inc(&bioctx->ref);
generic_make_request(clone);
+ if (bio_op(bio) == REQ_OP_WRITE && dmz_is_seq(zone))
+ zone->wp_block += nr_blocks;
+
return 0;
}
@@ -214,7 +214,7 @@ static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
if (nr_blocks) {
/* Valid blocks found: read them */
nr_blocks = min_t(unsigned int, nr_blocks, end_block - chunk_block);
- ret = dmz_submit_read_bio(dmz, rzone, bio, chunk_block, nr_blocks);
+ ret = dmz_submit_bio(dmz, rzone, bio, chunk_block, nr_blocks);
if (ret)
return ret;
chunk_block += nr_blocks;
@@ -228,25 +228,6 @@ static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
return 0;
}
-/*
- * Issue a write BIO to a zone.
- */
-static void dmz_submit_write_bio(struct dmz_target *dmz, struct dm_zone *zone,
- struct bio *bio, sector_t chunk_block,
- unsigned int nr_blocks)
-{
- struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
-
- /* Setup and submit the BIO */
- bio_set_dev(bio, dmz->dev->bdev);
- bio->bi_iter.bi_sector = dmz_start_sect(dmz->metadata, zone) + dmz_blk2sect(chunk_block);
- atomic_inc(&bioctx->ref);
- generic_make_request(bio);
-
- if (dmz_is_seq(zone))
- zone->wp_block += nr_blocks;
-}
-
/*
* Write blocks directly in a data zone, at the write pointer.
* If a buffer zone is assigned, invalidate the blocks written
@@ -265,7 +246,9 @@ static int dmz_handle_direct_write(struct dmz_target *dmz,
return -EROFS;
/* Submit write */
- dmz_submit_write_bio(dmz, zone, bio, chunk_block, nr_blocks);
+ ret = dmz_submit_bio(dmz, zone, bio, chunk_block, nr_blocks);
+ if (ret)
+ return ret;
/*
* Validate the blocks in the data zone and invalidate
@@ -301,7 +284,9 @@ static int dmz_handle_buffered_write(struct dmz_target *dmz,
return -EROFS;
/* Submit write */
- dmz_submit_write_bio(dmz, bzone, bio, chunk_block, nr_blocks);
+ ret = dmz_submit_bio(dmz, bzone, bio, chunk_block, nr_blocks);
+ if (ret)
+ return ret;
/*
* Validate the blocks in the buffer zone
@@ -600,7 +585,6 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
bioctx->zone = NULL;
bioctx->bio = bio;
atomic_set(&bioctx->ref, 1);
- bioctx->status = BLK_STS_OK;
/* Set the BIO pending in the flush list */
if (!nr_sectors && bio_op(bio) == REQ_OP_WRITE) {
@@ -623,35 +607,6 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
return DM_MAPIO_SUBMITTED;
}
-/*
- * Completed target BIO processing.
- */
-static int dmz_end_io(struct dm_target *ti, struct bio *bio, blk_status_t *error)
-{
- struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
-
- if (bioctx->status == BLK_STS_OK && *error)
- bioctx->status = *error;
-
- if (!atomic_dec_and_test(&bioctx->ref))
- return DM_ENDIO_INCOMPLETE;
-
- /* Done */
- bio->bi_status = bioctx->status;
-
- if (bioctx->zone) {
- struct dm_zone *zone = bioctx->zone;
-
- if (*error && bio_op(bio) == REQ_OP_WRITE) {
- if (dmz_is_seq(zone))
- set_bit(DMZ_SEQ_WRITE_ERR, &zone->flags);
- }
- dmz_deactivate_zone(zone);
- }
-
- return DM_ENDIO_DONE;
-}
-
/*
* Get zoned device information.
*/
@@ -946,7 +901,6 @@ static struct target_type dmz_type = {
.ctr = dmz_ctr,
.dtr = dmz_dtr,
.map = dmz_map,
- .end_io = dmz_end_io,
.io_hints = dmz_io_hints,
.prepare_ioctl = dmz_prepare_ioctl,
.postsuspend = dmz_suspend,
--
2.19.2
On Wed, 12 Dec 2018, Dave Hansen wrote:
> From: Dave Hansen <dave.hansen(a)linux.intel.com>
>
> Memory protection key behavior should be the same in a child as it was
> in the parent before a fork. But, there is a bug that resets the
> state in the child at fork instead of preserving it.
>
> Our creation of new mm's is a bit convoluted. At fork(), the code
> does:
>
> 1. memcpy() the parent mm to initialize child
> 2. mm_init() to initalize some select stuff stuff
> 3. dup_mmap() to create true copies that memcpy()
> did not do right.
>
> For pkeys, we need to preserve two bits of state across a fork:
> 'execute_only_pkey' and 'pkey_allocation_map'. Those are preserved by
> the memcpy(), which I thought did the right thing. But, mm_init()
> calls init_new_context(), which I thought was *only* for execve()-time
> and overwrites 'execute_only_pkey' and 'pkey_allocation_map' with
> "new" values. But, alas, init_new_context() is used at execve() and
> fork().
>
> The result is that, after a fork(), the child's pkey state ends up
> looking like it does after an execve(), which is totally wrong. pkeys
> that are already allocated can be allocated again, for instance.
>
> To fix this, add code called by dup_mmap() to copy the pkey state from
> parent to child explicitly. Also add a comment above init_new_context()
> to make it more clear to the next poor sod what this code is used for.
>
> Fixes: e8c24d3a23a ("x86/pkeys: Allocation/free syscalls")
> Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
> Cc: Thomas Gleixner <tglx(a)linutronix.de>
> Cc: Ingo Molnar <mingo(a)redhat.com>
> Cc: Borislav Petkov <bp(a)alien8.de>
> Cc: "H. Peter Anvin" <hpa(a)zytor.com>
> Cc: x86(a)kernel.org
> Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
> Cc: Peter Zijlstra <peterz(a)infradead.org>
> Cc: Michael Ellerman <mpe(a)ellerman.id.au>
> Cc: Will Deacon <will.deacon(a)arm.com>
> Cc: Andy Lutomirski <luto(a)kernel.org>
> Cc: Joerg Roedel <jroedel(a)suse.de>
> Cc: stable(a)vger.kernel.org
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
[ resending without broken headers ]
this is a backport of commit 7aa54be297655 ("locking/qspinlock, x86:
Provide liveness guarantee") for the v4.9 stable tree.
For the v4.4 tree the ARCH_USE_QUEUED_SPINLOCKS option got disabled on
x86.
For v4.9 it has been decided to do a minimal backport of the final fix
(including all its dependencies).
With this backport I can't reproduce the issue in the latest v4.9-RT
tree. I was able to boot (and use) an arm64 box with these patches so it
is not broken in an abvious way.
Sebastian
hugetlbfs page faults can race with truncate and hole punch operations.
Current code in the page fault path attempts to handle this by 'backing
out' operations if we encounter the race. One obvious omission in the
current code is removing a page newly added to the page cache. This is
pretty straight forward to address, but there is a more subtle and
difficult issue of backing out hugetlb reservations. To handle this
correctly, the 'reservation state' before page allocation needs to be
noted so that it can be properly backed out. There are four distinct
possibilities for reservation state: shared/reserved, shared/no-resv,
private/reserved and private/no-resv. Backing out a reservation may
require memory allocation which could fail so that needs to be taken
into account as well.
Instead of writing the required complicated code for this rare
occurrence, just eliminate the race. i_mmap_rwsem is now held in read
mode for the duration of page fault processing. Hold i_mmap_rwsem
longer in truncation and hold punch code to cover the call to
remove_inode_hugepages.
Cc: <stable(a)vger.kernel.org>
Fixes: ebed4bfc8da8 ("hugetlb: fix absurd HugePages_Rsvd")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
---
fs/hugetlbfs/inode.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 32920a10100e..3244147fc42b 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -505,8 +505,8 @@ static int hugetlb_vmtruncate(struct inode *inode, loff_t offset)
i_mmap_lock_write(mapping);
if (!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root))
hugetlb_vmdelete_list(&mapping->i_mmap, pgoff, 0);
- i_mmap_unlock_write(mapping);
remove_inode_hugepages(inode, offset, LLONG_MAX);
+ i_mmap_unlock_write(mapping);
return 0;
}
@@ -540,8 +540,8 @@ static long hugetlbfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
hugetlb_vmdelete_list(&mapping->i_mmap,
hole_start >> PAGE_SHIFT,
hole_end >> PAGE_SHIFT);
- i_mmap_unlock_write(mapping);
remove_inode_hugepages(inode, hole_start, hole_end);
+ i_mmap_unlock_write(mapping);
inode_unlock(inode);
}
--
2.17.2
this is a backport of commit 7aa54be297655 ("locking/qspinlock, x86:
Provide liveness guarantee") for the v4.19 stable tree.
Initially I assumed that this was merged late in v4.19-rc but actually
it is just part v4.20-rc1.
For v4.19, most things are already in the tree. The GEN_BINARY_RMWcc
macro is still "old" and I skipped the documentation update.
Sebastian
The patch titled
Subject: hugetlbfs: remove unnecessary code after i_mmap_rwsem synchronization
has been removed from the -mm tree. Its filename was
hugetlbfs-remove-unnecessary-code-after-i_mmap_rwsem-synchronization.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlbfs: remove unnecessary code after i_mmap_rwsem synchronization
After expanding i_mmap_rwsem use for better shared pmd and page fault/
truncation synchronization, remove code that is no longer necessary.
Link: http://lkml.kernel.org/r/20181203200850.6460-4-mike.kravetz@oracle.com
Fixes: ebed4bfc8da8 ("hugetlb: fix absurd HugePages_Rsvd")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar(a)linux.vnet.ibm.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: Prakash Sangappa <prakash.sangappa(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/hugetlbfs/inode.c | 46 +++++++++++++----------------------------
mm/hugetlb.c | 21 ++++++++----------
2 files changed, 25 insertions(+), 42 deletions(-)
--- a/fs/hugetlbfs/inode.c~hugetlbfs-remove-unnecessary-code-after-i_mmap_rwsem-synchronization
+++ a/fs/hugetlbfs/inode.c
@@ -383,17 +383,16 @@ hugetlb_vmdelete_list(struct rb_root_cac
* truncation is indicated by end of range being LLONG_MAX
* In this case, we first scan the range and release found pages.
* After releasing pages, hugetlb_unreserve_pages cleans up region/reserv
- * maps and global counts. Page faults can not race with truncation
- * in this routine. hugetlb_no_page() prevents page faults in the
- * truncated range. It checks i_size before allocation, and again after
- * with the page table lock for the page held. The same lock must be
- * acquired to unmap a page.
+ * maps and global counts.
* hole punch is indicated if end is not LLONG_MAX
* In the hole punch case we scan the range and release found pages.
* Only when releasing a page is the associated region/reserv map
* deleted. The region/reserv map for ranges without associated
- * pages are not modified. Page faults can race with hole punch.
- * This is indicated if we find a mapped page.
+ * pages are not modified.
+ *
+ * Callers of this routine must hold the i_mmap_rwsem in write mode to prevent
+ * races with page faults.
+ *
* Note: If the passed end of range value is beyond the end of file, but
* not LLONG_MAX this routine still performs a hole punch operation.
*/
@@ -423,32 +422,14 @@ static void remove_inode_hugepages(struc
for (i = 0; i < pagevec_count(&pvec); ++i) {
struct page *page = pvec.pages[i];
- u32 hash;
index = page->index;
- hash = hugetlb_fault_mutex_hash(h, current->mm,
- &pseudo_vma,
- mapping, index, 0);
- mutex_lock(&hugetlb_fault_mutex_table[hash]);
-
/*
- * If page is mapped, it was faulted in after being
- * unmapped in caller. Unmap (again) now after taking
- * the fault mutex. The mutex will prevent faults
- * until we finish removing the page.
- *
- * This race can only happen in the hole punch case.
- * Getting here in a truncate operation is a bug.
+ * A mapped page is impossible as callers should unmap
+ * all references before calling. And, i_mmap_rwsem
+ * prevents the creation of additional mappings.
*/
- if (unlikely(page_mapped(page))) {
- BUG_ON(truncate_op);
-
- i_mmap_lock_write(mapping);
- hugetlb_vmdelete_list(&mapping->i_mmap,
- index * pages_per_huge_page(h),
- (index + 1) * pages_per_huge_page(h));
- i_mmap_unlock_write(mapping);
- }
+ VM_BUG_ON(page_mapped(page));
lock_page(page);
/*
@@ -470,7 +451,6 @@ static void remove_inode_hugepages(struc
}
unlock_page(page);
- mutex_unlock(&hugetlb_fault_mutex_table[hash]);
}
huge_pagevec_release(&pvec);
cond_resched();
@@ -624,7 +604,11 @@ static long hugetlbfs_fallocate(struct f
/* addr is the offset within the file (zero based) */
addr = index * hpage_size;
- /* mutex taken here, fault path and hole punch */
+ /*
+ * fault mutex taken here, protects against fault path
+ * and hole punch. inode_lock previously taken protects
+ * against truncation.
+ */
hash = hugetlb_fault_mutex_hash(h, mm, &pseudo_vma, mapping,
index, addr);
mutex_lock(&hugetlb_fault_mutex_table[hash]);
--- a/mm/hugetlb.c~hugetlbfs-remove-unnecessary-code-after-i_mmap_rwsem-synchronization
+++ a/mm/hugetlb.c
@@ -3761,16 +3761,16 @@ static vm_fault_t hugetlb_no_page(struct
}
/*
- * Use page lock to guard against racing truncation
- * before we get page_table_lock.
+ * We can not race with truncation due to holding i_mmap_rwsem.
+ * Check once here for faults beyond end of file.
*/
+ size = i_size_read(mapping->host) >> huge_page_shift(h);
+ if (idx >= size)
+ goto out;
+
retry:
page = find_lock_page(mapping, idx);
if (!page) {
- size = i_size_read(mapping->host) >> huge_page_shift(h);
- if (idx >= size)
- goto out;
-
/*
* Check for page in userfault range
*/
@@ -3860,9 +3860,6 @@ retry:
}
ptl = huge_pte_lock(h, mm, ptep);
- size = i_size_read(mapping->host) >> huge_page_shift(h);
- if (idx >= size)
- goto backout;
ret = 0;
if (!huge_pte_none(huge_ptep_get(ptep)))
@@ -3965,8 +3962,10 @@ vm_fault_t hugetlb_fault(struct mm_struc
/*
* Acquire i_mmap_rwsem before calling huge_pte_alloc and hold
- * until finished with ptep. This prevents huge_pmd_unshare from
- * being called elsewhere and making the ptep no longer valid.
+ * until finished with ptep. This serves two purposes:
+ * 1) It prevents huge_pmd_unshare from being called elsewhere
+ * and making the ptep no longer valid.
+ * 2) It synchronizes us with file truncation.
*
* ptep could have already be assigned via huge_pte_offset. That
* is OK, as huge_pte_alloc will return the same value unless
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
The patch titled
Subject: hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race
has been removed from the -mm tree. Its filename was
hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race
hugetlbfs page faults can race with truncate and hole punch operations.
Current code in the page fault path attempts to handle this by 'backing
out' operations if we encounter the race. One obvious omission in the
current code is removing a page newly added to the page cache. This is
pretty straight forward to address, but there is a more subtle and
difficult issue of backing out hugetlb reservations. To handle this
correctly, the 'reservation state' before page allocation needs to be
noted so that it can be properly backed out. There are four distinct
possibilities for reservation state: shared/reserved, shared/no-resv,
private/reserved and private/no-resv. Backing out a reservation may
require memory allocation which could fail so that needs to be taken into
account as well.
Instead of writing the required complicated code for this rare occurrence,
just eliminate the race. i_mmap_rwsem is now held in read mode for the
duration of page fault processing. Hold i_mmap_rwsem longer in truncation
and hold punch code to cover the call to remove_inode_hugepages.
Link: http://lkml.kernel.org/r/20181203200850.6460-3-mike.kravetz@oracle.com
Fixes: ebed4bfc8da8 ("hugetlb: fix absurd HugePages_Rsvd")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar(a)linux.vnet.ibm.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: Prakash Sangappa <prakash.sangappa(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/hugetlbfs/inode.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/hugetlbfs/inode.c~hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race
+++ a/fs/hugetlbfs/inode.c
@@ -505,8 +505,8 @@ static int hugetlb_vmtruncate(struct ino
i_mmap_lock_write(mapping);
if (!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root))
hugetlb_vmdelete_list(&mapping->i_mmap, pgoff, 0);
- i_mmap_unlock_write(mapping);
remove_inode_hugepages(inode, offset, LLONG_MAX);
+ i_mmap_unlock_write(mapping);
return 0;
}
@@ -540,8 +540,8 @@ static long hugetlbfs_punch_hole(struct
hugetlb_vmdelete_list(&mapping->i_mmap,
hole_start >> PAGE_SHIFT,
hole_end >> PAGE_SHIFT);
- i_mmap_unlock_write(mapping);
remove_inode_hugepages(inode, hole_start, hole_end);
+ i_mmap_unlock_write(mapping);
inode_unlock(inode);
}
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
hugetlbfs-remove-unnecessary-code-after-i_mmap_rwsem-synchronization.patch
The patch titled
Subject: hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization
has been removed from the -mm tree. Its filename was
hugetlbfs-use-i_mmap_rwsem-for-more-pmd-sharing-synchronization.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization
Patch series "hugetlbfs: use i_mmap_rwsem for better synchronization".
These patches are a follow up to the RFC,
http://lkml.kernel.org/r/20181024045053.1467-1-mike.kravetz@oracle.com
Comments made by Naoya were addressed.
There are two primary issues addressed here:
1) For shared pmds, huge PE pointers returned by huge_pte_alloc can
become invalid via a call to huge_pmd_unshare by another thread.
2) hugetlbfs page faults can race with truncation causing invalid
global reserve counts and state.
Both issues are addressed by expanding the use of i_mmap_rwsem.
These issues have existed for a long time. They can be recreated with a
test program that causes page fault/truncation races. For simple
mappings, this results in a negative HugePages_Rsvd count. If racing with
mappings that contain shared pmds, we can hit "BUG at
fs/hugetlbfs/inode.c:444!" or Oops! as the result of an invalid memory
reference.
I broke up the larger RFC into separate patches addressing each issue.
Hopefully, this is easier to understand/review.
This patch (of 3):
While looking at BUGs associated with invalid huge page map counts, it was
discovered and observed that a huge pte pointer could become 'invalid' and
point to another task's page table. Consider the following:
A task takes a page fault on a shared hugetlbfs file and calls
huge_pte_alloc to get a ptep. Suppose the returned ptep points to a
shared pmd.
Now, another task truncates the hugetlbfs file. As part of truncation, it
unmaps everyone who has the file mapped. If the range being truncated is
covered by a shared pmd, huge_pmd_unshare will be called. For all but the
last user of the shared pmd, huge_pmd_unshare will clear the pud pointing
to the pmd. If the task in the middle of the page fault is not the last
user, the ptep returned by huge_pte_alloc now points to another task's
page table or worse. This leads to bad things such as incorrect page
map/reference counts or invalid memory references.
To fix, expand the use of i_mmap_rwsem as follows:
- i_mmap_rwsem is held in read mode whenever huge_pmd_share is called.
huge_pmd_share is only called via huge_pte_alloc, so callers of
huge_pte_alloc take i_mmap_rwsem before calling. In addition, callers
of huge_pte_alloc continue to hold the semaphore until finished with the
ptep.
- i_mmap_rwsem is held in write mode whenever huge_pmd_unshare is called.
Link: http://lkml.kernel.org/r/20181203200850.6460-2-mike.kravetz@oracle.com
Fixes: 39dde65c9940 ("shared page table for hugetlb page")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar(a)linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: Prakash Sangappa <prakash.sangappa(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 70 ++++++++++++++++++++++++++++++++----------
mm/memory-failure.c | 14 +++++++-
mm/migrate.c | 13 +++++++
mm/rmap.c | 3 +
mm/userfaultfd.c | 11 +++++-
5 files changed, 91 insertions(+), 20 deletions(-)
--- a/mm/hugetlb.c~hugetlbfs-use-i_mmap_rwsem-for-more-pmd-sharing-synchronization
+++ a/mm/hugetlb.c
@@ -3240,6 +3240,7 @@ int copy_hugetlb_page_range(struct mm_st
int cow;
struct hstate *h = hstate_vma(vma);
unsigned long sz = huge_page_size(h);
+ struct address_space *mapping = vma->vm_file->f_mapping;
unsigned long mmun_start; /* For mmu_notifiers */
unsigned long mmun_end; /* For mmu_notifiers */
int ret = 0;
@@ -3253,11 +3254,23 @@ int copy_hugetlb_page_range(struct mm_st
for (addr = vma->vm_start; addr < vma->vm_end; addr += sz) {
spinlock_t *src_ptl, *dst_ptl;
+
src_pte = huge_pte_offset(src, addr, sz);
if (!src_pte)
continue;
+
+ /*
+ * i_mmap_rwsem must be held to call huge_pte_alloc.
+ * Continue to hold until finished with dst_pte, otherwise
+ * it could go away if part of a shared pmd.
+ *
+ * Technically, i_mmap_rwsem is only needed in the non-cow
+ * case as cow mappings are not shared.
+ */
+ i_mmap_lock_read(mapping);
dst_pte = huge_pte_alloc(dst, addr, sz);
if (!dst_pte) {
+ i_mmap_unlock_read(mapping);
ret = -ENOMEM;
break;
}
@@ -3272,8 +3285,10 @@ int copy_hugetlb_page_range(struct mm_st
* after taking the lock below.
*/
dst_entry = huge_ptep_get(dst_pte);
- if ((dst_pte == src_pte) || !huge_pte_none(dst_entry))
+ if ((dst_pte == src_pte) || !huge_pte_none(dst_entry)) {
+ i_mmap_unlock_read(mapping);
continue;
+ }
dst_ptl = huge_pte_lock(h, dst, dst_pte);
src_ptl = huge_pte_lockptr(h, src, src_pte);
@@ -3322,6 +3337,8 @@ int copy_hugetlb_page_range(struct mm_st
}
spin_unlock(src_ptl);
spin_unlock(dst_ptl);
+
+ i_mmap_unlock_read(mapping);
}
if (cow)
@@ -3773,14 +3790,18 @@ retry:
};
/*
- * hugetlb_fault_mutex must be dropped before
- * handling userfault. Reacquire after handling
- * fault to make calling code simpler.
+ * hugetlb_fault_mutex and i_mmap_rwsem must be
+ * dropped before handling userfault. Reacquire
+ * after handling fault to make calling code simpler.
*/
hash = hugetlb_fault_mutex_hash(h, mm, vma, mapping,
idx, haddr);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+ i_mmap_unlock_read(mapping);
+
ret = handle_userfault(&vmf, VM_UFFD_MISSING);
+
+ i_mmap_lock_read(mapping);
mutex_lock(&hugetlb_fault_mutex_table[hash]);
goto out;
}
@@ -3928,6 +3949,11 @@ vm_fault_t hugetlb_fault(struct mm_struc
ptep = huge_pte_offset(mm, haddr, huge_page_size(h));
if (ptep) {
+ /*
+ * Since we hold no locks, ptep could be stale. That is
+ * OK as we are only making decisions based on content and
+ * not actually modifying content here.
+ */
entry = huge_ptep_get(ptep);
if (unlikely(is_hugetlb_entry_migration(entry))) {
migration_entry_wait_huge(vma, mm, ptep);
@@ -3935,20 +3961,31 @@ vm_fault_t hugetlb_fault(struct mm_struc
} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
return VM_FAULT_HWPOISON_LARGE |
VM_FAULT_SET_HINDEX(hstate_index(h));
- } else {
- ptep = huge_pte_alloc(mm, haddr, huge_page_size(h));
- if (!ptep)
- return VM_FAULT_OOM;
}
+ /*
+ * Acquire i_mmap_rwsem before calling huge_pte_alloc and hold
+ * until finished with ptep. This prevents huge_pmd_unshare from
+ * being called elsewhere and making the ptep no longer valid.
+ *
+ * ptep could have already be assigned via huge_pte_offset. That
+ * is OK, as huge_pte_alloc will return the same value unless
+ * something changed.
+ */
mapping = vma->vm_file->f_mapping;
- idx = vma_hugecache_offset(h, vma, haddr);
+ i_mmap_lock_read(mapping);
+ ptep = huge_pte_alloc(mm, haddr, huge_page_size(h));
+ if (!ptep) {
+ i_mmap_unlock_read(mapping);
+ return VM_FAULT_OOM;
+ }
/*
* Serialize hugepage allocation and instantiation, so that we don't
* get spurious allocation failures if two CPUs race to instantiate
* the same page in the page cache.
*/
+ idx = vma_hugecache_offset(h, vma, haddr);
hash = hugetlb_fault_mutex_hash(h, mm, vma, mapping, idx, haddr);
mutex_lock(&hugetlb_fault_mutex_table[hash]);
@@ -4036,6 +4073,7 @@ out_ptl:
}
out_mutex:
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+ i_mmap_unlock_read(mapping);
/*
* Generally it's safe to hold refcount during waiting page lock. But
* here we just wait to defer the next page fault to avoid busy loop and
@@ -4640,10 +4678,12 @@ void adjust_range_if_pmd_sharing_possibl
* Search for a shareable pmd page for hugetlb. In any case calls pmd_alloc()
* and returns the corresponding pte. While this is not necessary for the
* !shared pmd case because we can allocate the pmd later as well, it makes the
- * code much cleaner. pmd allocation is essential for the shared case because
- * pud has to be populated inside the same i_mmap_rwsem section - otherwise
- * racing tasks could either miss the sharing (see huge_pte_offset) or select a
- * bad pmd for sharing.
+ * code much cleaner.
+ *
+ * This routine must be called with i_mmap_rwsem held in at least read mode.
+ * For hugetlbfs, this prevents removal of any page table entries associated
+ * with the address space. This is important as we are setting up sharing
+ * based on existing page table entries (mappings).
*/
pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud)
{
@@ -4660,7 +4700,6 @@ pte_t *huge_pmd_share(struct mm_struct *
if (!vma_shareable(vma, addr))
return (pte_t *)pmd_alloc(mm, pud, addr);
- i_mmap_lock_write(mapping);
vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) {
if (svma == vma)
continue;
@@ -4690,7 +4729,6 @@ pte_t *huge_pmd_share(struct mm_struct *
spin_unlock(ptl);
out:
pte = (pte_t *)pmd_alloc(mm, pud, addr);
- i_mmap_unlock_write(mapping);
return pte;
}
@@ -4701,7 +4739,7 @@ out:
* indicated by page_count > 1, unmap is achieved by clearing pud and
* decrementing the ref count. If count == 1, the pte page is not shared.
*
- * called with page table lock held.
+ * Called with page table lock held and i_mmap_rwsem held in write mode.
*
* returns: 1 successfully unmapped a shared pte page
* 0 the underlying pte page is not shared, or it is the last user
--- a/mm/memory-failure.c~hugetlbfs-use-i_mmap_rwsem-for-more-pmd-sharing-synchronization
+++ a/mm/memory-failure.c
@@ -1028,7 +1028,19 @@ static bool hwpoison_user_mappings(struc
if (kill)
collect_procs(hpage, &tokill, flags & MF_ACTION_REQUIRED);
- unmap_success = try_to_unmap(hpage, ttu);
+ if (!PageHuge(hpage)) {
+ unmap_success = try_to_unmap(hpage, ttu);
+ } else {
+ /*
+ * For hugetlb pages, try_to_unmap could potentially call
+ * huge_pmd_unshare. Because of this, take semaphore in
+ * write mode here and set TTU_RMAP_LOCKED to indicate we
+ * have taken the lock at this higer level.
+ */
+ i_mmap_lock_write(mapping);
+ unmap_success = try_to_unmap(hpage, ttu|TTU_RMAP_LOCKED);
+ i_mmap_unlock_write(mapping);
+ }
if (!unmap_success)
pr_err("Memory failure: %#lx: failed to unmap page (mapcount=%d)\n",
pfn, page_mapcount(hpage));
--- a/mm/migrate.c~hugetlbfs-use-i_mmap_rwsem-for-more-pmd-sharing-synchronization
+++ a/mm/migrate.c
@@ -1297,8 +1297,19 @@ static int unmap_and_move_huge_page(new_
goto put_anon;
if (page_mapped(hpage)) {
+ struct address_space *mapping = page_mapping(hpage);
+
+ /*
+ * try_to_unmap could potentially call huge_pmd_unshare.
+ * Because of this, take semaphore in write mode here and
+ * set TTU_RMAP_LOCKED to let lower levels know we have
+ * taken the lock.
+ */
+ i_mmap_lock_write(mapping);
try_to_unmap(hpage,
- TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+ TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS|
+ TTU_RMAP_LOCKED);
+ i_mmap_unlock_write(mapping);
page_was_mapped = 1;
}
--- a/mm/rmap.c~hugetlbfs-use-i_mmap_rwsem-for-more-pmd-sharing-synchronization
+++ a/mm/rmap.c
@@ -1374,6 +1374,9 @@ static bool try_to_unmap_one(struct page
/*
* If sharing is possible, start and end will be adjusted
* accordingly.
+ *
+ * If called for a huge page, caller must hold i_mmap_rwsem
+ * in write mode as it is possible to call huge_pmd_unshare.
*/
adjust_range_if_pmd_sharing_possible(vma, &start, &end);
}
--- a/mm/userfaultfd.c~hugetlbfs-use-i_mmap_rwsem-for-more-pmd-sharing-synchronization
+++ a/mm/userfaultfd.c
@@ -267,10 +267,14 @@ retry:
VM_BUG_ON(dst_addr & ~huge_page_mask(h));
/*
- * Serialize via hugetlb_fault_mutex
+ * Serialize via i_mmap_rwsem and hugetlb_fault_mutex.
+ * i_mmap_rwsem ensures the dst_pte remains valid even
+ * in the case of shared pmds. fault mutex prevents
+ * races with other faulting threads.
*/
- idx = linear_page_index(dst_vma, dst_addr);
mapping = dst_vma->vm_file->f_mapping;
+ i_mmap_lock_read(mapping);
+ idx = linear_page_index(dst_vma, dst_addr);
hash = hugetlb_fault_mutex_hash(h, dst_mm, dst_vma, mapping,
idx, dst_addr);
mutex_lock(&hugetlb_fault_mutex_table[hash]);
@@ -279,6 +283,7 @@ retry:
dst_pte = huge_pte_alloc(dst_mm, dst_addr, huge_page_size(h));
if (!dst_pte) {
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+ i_mmap_unlock_read(mapping);
goto out_unlock;
}
@@ -286,6 +291,7 @@ retry:
dst_pteval = huge_ptep_get(dst_pte);
if (!huge_pte_none(dst_pteval)) {
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+ i_mmap_unlock_read(mapping);
goto out_unlock;
}
@@ -293,6 +299,7 @@ retry:
dst_addr, src_addr, &page);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+ i_mmap_unlock_read(mapping);
vm_alloc_shared = vm_shared;
cond_resched();
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race.patch
hugetlbfs-remove-unnecessary-code-after-i_mmap_rwsem-synchronization.patch
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
In the case that a bprintk event has a dereferenced pointer that is
stored as a string, and there's more values to process (more args), the
arg was not updated to point to the next arg after processing the
dereferenced pointer, and it screwed up what was to be displayed.
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: linux-trace-devel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Fixes: 37db96bb49629 ("tools lib traceevent: Handle new pointer processing of bprint strings")
Link: http://lkml.kernel.org/r/20181210134522.3f71e2ca@gandalf.local.home
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
---
tools/lib/traceevent/event-parse.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index a5ed291b8a9f..69a96e39f0ab 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -4973,6 +4973,7 @@ static void pretty_print(struct trace_seq *s, void *data, int size, struct tep_e
if (arg->type == TEP_PRINT_BSTRING) {
trace_seq_puts(s, arg->string.string);
+ arg = arg->next;
break;
}
--
2.19.2
From: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
commit 28a9a9e83ceae2cee25b9af9ad20d53aaa9ab951 upstream
Packet queue state is over used to determine SDMA descriptor
availablitity and packet queue request state.
cpu 0 ret = user_sdma_send_pkts(req, pcount);
cpu 0 if (atomic_read(&pq->n_reqs))
cpu 1 IRQ user_sdma_txreq_cb calls pq_update() (state to _INACTIVE)
cpu 0 xchg(&pq->state, SDMA_PKT_Q_ACTIVE);
At this point pq->n_reqs == 0 and pq->state is incorrectly
SDMA_PKT_Q_ACTIVE. The close path will hang waiting for the state
to return to _INACTIVE.
This can also change the state from _DEFERRED to _ACTIVE. However,
this is a mostly benign race.
Remove the racy code path.
Use n_reqs to determine if a packet queue is active or not.
Cc: <stable(a)vger.kernel.org> # 4.19.x
Reviewed-by: Mitko Haralanov <mitko.haralanov(a)intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn(a)intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro(a)intel.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
---
drivers/infiniband/hw/hfi1/user_sdma.c | 24 ++++++++++--------------
drivers/infiniband/hw/hfi1/user_sdma.h | 9 +++++----
2 files changed, 15 insertions(+), 18 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index 39134dd..51831bf 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -187,7 +187,6 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt,
pq->ctxt = uctxt->ctxt;
pq->subctxt = fd->subctxt;
pq->n_max_reqs = hfi1_sdma_comp_ring_size;
- pq->state = SDMA_PKT_Q_INACTIVE;
atomic_set(&pq->n_reqs, 0);
init_waitqueue_head(&pq->wait);
atomic_set(&pq->n_locked, 0);
@@ -276,7 +275,7 @@ int hfi1_user_sdma_free_queues(struct hfi1_filedata *fd,
/* Wait until all requests have been freed. */
wait_event_interruptible(
pq->wait,
- (READ_ONCE(pq->state) == SDMA_PKT_Q_INACTIVE));
+ !atomic_read(&pq->n_reqs));
kfree(pq->reqs);
kfree(pq->req_in_use);
kmem_cache_destroy(pq->txreq_cache);
@@ -312,6 +311,13 @@ static u8 dlid_to_selector(u16 dlid)
return mapping[hash];
}
+/**
+ * hfi1_user_sdma_process_request() - Process and start a user sdma request
+ * @fd: valid file descriptor
+ * @iovec: array of io vectors to process
+ * @dim: overall iovec array size
+ * @count: number of io vector array entries processed
+ */
int hfi1_user_sdma_process_request(struct hfi1_filedata *fd,
struct iovec *iovec, unsigned long dim,
unsigned long *count)
@@ -560,21 +566,13 @@ int hfi1_user_sdma_process_request(struct hfi1_filedata *fd,
req->ahg_idx = sdma_ahg_alloc(req->sde);
set_comp_state(pq, cq, info.comp_idx, QUEUED, 0);
+ pq->state = SDMA_PKT_Q_ACTIVE;
/* Send the first N packets in the request to buy us some time */
ret = user_sdma_send_pkts(req, pcount);
if (unlikely(ret < 0 && ret != -EBUSY))
goto free_req;
/*
- * It is possible that the SDMA engine would have processed all the
- * submitted packets by the time we get here. Therefore, only set
- * packet queue state to ACTIVE if there are still uncompleted
- * requests.
- */
- if (atomic_read(&pq->n_reqs))
- xchg(&pq->state, SDMA_PKT_Q_ACTIVE);
-
- /*
* This is a somewhat blocking send implementation.
* The driver will block the caller until all packets of the
* request have been submitted to the SDMA engine. However, it
@@ -1409,10 +1407,8 @@ static void user_sdma_txreq_cb(struct sdma_txreq *txreq, int status)
static inline void pq_update(struct hfi1_user_sdma_pkt_q *pq)
{
- if (atomic_dec_and_test(&pq->n_reqs)) {
- xchg(&pq->state, SDMA_PKT_Q_INACTIVE);
+ if (atomic_dec_and_test(&pq->n_reqs))
wake_up(&pq->wait);
- }
}
static void user_sdma_free_request(struct user_sdma_request *req, bool unpin)
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.h b/drivers/infiniband/hw/hfi1/user_sdma.h
index 0ae0645..91c343f 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.h
+++ b/drivers/infiniband/hw/hfi1/user_sdma.h
@@ -105,9 +105,10 @@ static inline int ahg_header_set(u32 *arr, int idx, size_t array_size,
#define TXREQ_FLAGS_REQ_ACK BIT(0) /* Set the ACK bit in the header */
#define TXREQ_FLAGS_REQ_DISABLE_SH BIT(1) /* Disable header suppression */
-#define SDMA_PKT_Q_INACTIVE BIT(0)
-#define SDMA_PKT_Q_ACTIVE BIT(1)
-#define SDMA_PKT_Q_DEFERRED BIT(2)
+enum pkt_q_sdma_state {
+ SDMA_PKT_Q_ACTIVE,
+ SDMA_PKT_Q_DEFERRED,
+};
/*
* Maximum retry attempts to submit a TX request
@@ -133,7 +134,7 @@ struct hfi1_user_sdma_pkt_q {
struct user_sdma_request *reqs;
unsigned long *req_in_use;
struct iowait busy;
- unsigned state;
+ enum pkt_q_sdma_state state;
wait_queue_head_t wait;
unsigned long unpinned;
struct mmu_rb_handler *handler;
From: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
commit 28a9a9e83ceae2cee25b9af9ad20d53aaa9ab951 upstream
Packet queue state is over used to determine SDMA descriptor
availablitity and packet queue request state.
cpu 0 ret = user_sdma_send_pkts(req, pcount);
cpu 0 if (atomic_read(&pq->n_reqs))
cpu 1 IRQ user_sdma_txreq_cb calls pq_update() (state to _INACTIVE)
cpu 0 xchg(&pq->state, SDMA_PKT_Q_ACTIVE);
At this point pq->n_reqs == 0 and pq->state is incorrectly
SDMA_PKT_Q_ACTIVE. The close path will hang waiting for the state
to return to _INACTIVE.
This can also change the state from _DEFERRED to _ACTIVE. However,
this is a mostly benign race.
Remove the racy code path.
Use n_reqs to determine if a packet queue is active or not.
Cc: <stable(a)vger.kernel.org> # 4.9.0
Reviewed-by: Mitko Haralanov <mitko.haralanov(a)intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn(a)intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
---
drivers/infiniband/hw/hfi1/user_sdma.c | 28 ++++++++++------------------
drivers/infiniband/hw/hfi1/user_sdma.h | 7 ++++++-
2 files changed, 16 insertions(+), 19 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index 619475c..4c11116 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -151,10 +151,6 @@
#define SDMA_REQ_HAVE_AHG 1
#define SDMA_REQ_HAS_ERROR 2
-#define SDMA_PKT_Q_INACTIVE BIT(0)
-#define SDMA_PKT_Q_ACTIVE BIT(1)
-#define SDMA_PKT_Q_DEFERRED BIT(2)
-
/*
* Maximum retry attempts to submit a TX request
* before putting the process to sleep.
@@ -408,7 +404,6 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt, struct file *fp)
pq->ctxt = uctxt->ctxt;
pq->subctxt = fd->subctxt;
pq->n_max_reqs = hfi1_sdma_comp_ring_size;
- pq->state = SDMA_PKT_Q_INACTIVE;
atomic_set(&pq->n_reqs, 0);
init_waitqueue_head(&pq->wait);
atomic_set(&pq->n_locked, 0);
@@ -491,7 +486,7 @@ int hfi1_user_sdma_free_queues(struct hfi1_filedata *fd)
/* Wait until all requests have been freed. */
wait_event_interruptible(
pq->wait,
- (ACCESS_ONCE(pq->state) == SDMA_PKT_Q_INACTIVE));
+ !atomic_read(&pq->n_reqs));
kfree(pq->reqs);
kfree(pq->req_in_use);
kmem_cache_destroy(pq->txreq_cache);
@@ -527,6 +522,13 @@ static u8 dlid_to_selector(u16 dlid)
return mapping[hash];
}
+/**
+ * hfi1_user_sdma_process_request() - Process and start a user sdma request
+ * @fp: valid file pointer
+ * @iovec: array of io vectors to process
+ * @dim: overall iovec array size
+ * @count: number of io vector array entries processed
+ */
int hfi1_user_sdma_process_request(struct file *fp, struct iovec *iovec,
unsigned long dim, unsigned long *count)
{
@@ -768,21 +770,13 @@ int hfi1_user_sdma_process_request(struct file *fp, struct iovec *iovec,
}
set_comp_state(pq, cq, info.comp_idx, QUEUED, 0);
+ pq->state = SDMA_PKT_Q_ACTIVE;
/* Send the first N packets in the request to buy us some time */
ret = user_sdma_send_pkts(req, pcount);
if (unlikely(ret < 0 && ret != -EBUSY))
goto free_req;
/*
- * It is possible that the SDMA engine would have processed all the
- * submitted packets by the time we get here. Therefore, only set
- * packet queue state to ACTIVE if there are still uncompleted
- * requests.
- */
- if (atomic_read(&pq->n_reqs))
- xchg(&pq->state, SDMA_PKT_Q_ACTIVE);
-
- /*
* This is a somewhat blocking send implementation.
* The driver will block the caller until all packets of the
* request have been submitted to the SDMA engine. However, it
@@ -1526,10 +1520,8 @@ static void user_sdma_txreq_cb(struct sdma_txreq *txreq, int status)
static inline void pq_update(struct hfi1_user_sdma_pkt_q *pq)
{
- if (atomic_dec_and_test(&pq->n_reqs)) {
- xchg(&pq->state, SDMA_PKT_Q_INACTIVE);
+ if (atomic_dec_and_test(&pq->n_reqs))
wake_up(&pq->wait);
- }
}
static void user_sdma_free_request(struct user_sdma_request *req, bool unpin)
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.h b/drivers/infiniband/hw/hfi1/user_sdma.h
index 3900171..09dd843 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.h
+++ b/drivers/infiniband/hw/hfi1/user_sdma.h
@@ -53,6 +53,11 @@
extern uint extended_psn;
+enum pkt_q_sdma_state {
+ SDMA_PKT_Q_ACTIVE,
+ SDMA_PKT_Q_DEFERRED,
+};
+
struct hfi1_user_sdma_pkt_q {
struct list_head list;
unsigned ctxt;
@@ -65,7 +70,7 @@ struct hfi1_user_sdma_pkt_q {
struct user_sdma_request *reqs;
unsigned long *req_in_use;
struct iowait busy;
- unsigned state;
+ enum pkt_q_sdma_state state;
wait_queue_head_t wait;
unsigned long unpinned;
struct mmu_rb_handler *handler;
From: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
commit 28a9a9e83ceae2cee25b9af9ad20d53aaa9ab951 upstream
Packet queue state is over used to determine SDMA descriptor
availablitity and packet queue request state.
cpu 0 ret = user_sdma_send_pkts(req, pcount);
cpu 0 if (atomic_read(&pq->n_reqs))
cpu 1 IRQ user_sdma_txreq_cb calls pq_update() (state to _INACTIVE)
cpu 0 xchg(&pq->state, SDMA_PKT_Q_ACTIVE);
At this point pq->n_reqs == 0 and pq->state is incorrectly
SDMA_PKT_Q_ACTIVE. The close path will hang waiting for the state
to return to _INACTIVE.
This can also change the state from _DEFERRED to _ACTIVE. However,
this is a mostly benign race.
Remove the racy code path.
Use n_reqs to determine if a packet queue is active or not.
Cc: <stable(a)vger.kernel.org> # 4.14.0>
Reviewed-by: Mitko Haralanov <mitko.haralanov(a)intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn(a)intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
---
drivers/infiniband/hw/hfi1/user_sdma.c | 24 ++++++++++--------------
drivers/infiniband/hw/hfi1/user_sdma.h | 9 +++++----
2 files changed, 15 insertions(+), 18 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index c14ec04..cbe5ab2 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -187,7 +187,6 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt,
pq->ctxt = uctxt->ctxt;
pq->subctxt = fd->subctxt;
pq->n_max_reqs = hfi1_sdma_comp_ring_size;
- pq->state = SDMA_PKT_Q_INACTIVE;
atomic_set(&pq->n_reqs, 0);
init_waitqueue_head(&pq->wait);
atomic_set(&pq->n_locked, 0);
@@ -276,7 +275,7 @@ int hfi1_user_sdma_free_queues(struct hfi1_filedata *fd,
/* Wait until all requests have been freed. */
wait_event_interruptible(
pq->wait,
- (ACCESS_ONCE(pq->state) == SDMA_PKT_Q_INACTIVE));
+ !atomic_read(&pq->n_reqs));
kfree(pq->reqs);
kfree(pq->req_in_use);
kmem_cache_destroy(pq->txreq_cache);
@@ -312,6 +311,13 @@ static u8 dlid_to_selector(u16 dlid)
return mapping[hash];
}
+/**
+ * hfi1_user_sdma_process_request() - Process and start a user sdma request
+ * @fd: valid file descriptor
+ * @iovec: array of io vectors to process
+ * @dim: overall iovec array size
+ * @count: number of io vector array entries processed
+ */
int hfi1_user_sdma_process_request(struct hfi1_filedata *fd,
struct iovec *iovec, unsigned long dim,
unsigned long *count)
@@ -560,21 +566,13 @@ int hfi1_user_sdma_process_request(struct hfi1_filedata *fd,
req->ahg_idx = sdma_ahg_alloc(req->sde);
set_comp_state(pq, cq, info.comp_idx, QUEUED, 0);
+ pq->state = SDMA_PKT_Q_ACTIVE;
/* Send the first N packets in the request to buy us some time */
ret = user_sdma_send_pkts(req, pcount);
if (unlikely(ret < 0 && ret != -EBUSY))
goto free_req;
/*
- * It is possible that the SDMA engine would have processed all the
- * submitted packets by the time we get here. Therefore, only set
- * packet queue state to ACTIVE if there are still uncompleted
- * requests.
- */
- if (atomic_read(&pq->n_reqs))
- xchg(&pq->state, SDMA_PKT_Q_ACTIVE);
-
- /*
* This is a somewhat blocking send implementation.
* The driver will block the caller until all packets of the
* request have been submitted to the SDMA engine. However, it
@@ -1391,10 +1389,8 @@ static void user_sdma_txreq_cb(struct sdma_txreq *txreq, int status)
static inline void pq_update(struct hfi1_user_sdma_pkt_q *pq)
{
- if (atomic_dec_and_test(&pq->n_reqs)) {
- xchg(&pq->state, SDMA_PKT_Q_INACTIVE);
+ if (atomic_dec_and_test(&pq->n_reqs))
wake_up(&pq->wait);
- }
}
static void user_sdma_free_request(struct user_sdma_request *req, bool unpin)
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.h b/drivers/infiniband/hw/hfi1/user_sdma.h
index 5af5233..2b5326d 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.h
+++ b/drivers/infiniband/hw/hfi1/user_sdma.h
@@ -94,9 +94,10 @@
#define TXREQ_FLAGS_REQ_ACK BIT(0) /* Set the ACK bit in the header */
#define TXREQ_FLAGS_REQ_DISABLE_SH BIT(1) /* Disable header suppression */
-#define SDMA_PKT_Q_INACTIVE BIT(0)
-#define SDMA_PKT_Q_ACTIVE BIT(1)
-#define SDMA_PKT_Q_DEFERRED BIT(2)
+enum pkt_q_sdma_state {
+ SDMA_PKT_Q_ACTIVE,
+ SDMA_PKT_Q_DEFERRED,
+};
/*
* Maximum retry attempts to submit a TX request
@@ -124,7 +125,7 @@ struct hfi1_user_sdma_pkt_q {
struct user_sdma_request *reqs;
unsigned long *req_in_use;
struct iowait busy;
- unsigned state;
+ enum pkt_q_sdma_state state;
wait_queue_head_t wait;
unsigned long unpinned;
struct mmu_rb_handler *handler;
Hi Greg,
This was not marked for stable but seems it should be in stable. And the
second patch fixes the first one.
Please apply to your queue of 4.14-stable.
--
Regards
Sudip
Hello,
Please picked up this patch for linux 4.4 and 4.9.
This fixes CVE-2017-0605 (Rejected?). Tested in Debian ;)
Thank.
[ Upstream commit e09e28671cda63e6308b31798b997639120e2a21 ]
From: Amey Telawane <ameyt(a)codeaurora.org>
Date: Wed, 3 May 2017 15:41:14 +0530
Subject: [PATCH] tracing: Use strlcpy() instead of strcpy() in
__trace_find_cmdline()
Strcpy is inherently not safe, and strlcpy() should be used instead.
__trace_find_cmdline() uses strcpy() because the comms saved must have a
terminating nul character, but it doesn't hurt to add the extra protection
of using strlcpy() instead of strcpy().
Link: http://lkml.kernel.org/r/1493806274-13936-1-git-send-email-amit.pundir@lina…
Signed-off-by: Amey Telawane <ameyt(a)codeaurora.org>
[AmitP: Cherry-picked this commit from CodeAurora kernel/msm-3.10
https://source.codeaurora.org/quic/la/kernel/msm-3.10/commit/?id=2161ae9a70…]
Signed-off-by: Amit Pundir <amit.pundir(a)linaro.org>
[ Updated change log and removed the "- 1" from len parameter ]
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
kernel/trace/trace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1862,7 +1862,7 @@ static void __trace_find_cmdline(int pid
map = savedcmd->map_pid_to_cmdline[pid];
if (map != NO_CMDLINE_MAP)
- strcpy(comm, get_saved_cmdlines(map));
+ strlcpy(comm, get_saved_cmdlines(map), TASK_COMM_LEN);
else
strcpy(comm, "<...>");
}
Since commit d26c25a9d19b ("arm64: KVM: Tighten guest core register
access from userspace"), KVM_{GET,SET}_ONE_REG rejects register IDs
that do not correspond to a single underlying architectural register.
KVM_GET_REG_LIST was not changed to match however: instead, it
simply yields a list of 32-bit register IDs that together cover the
whole kvm_regs struct. This means that if userspace tries to use
the resulting list of IDs directly to drive calls to KVM_*_ONE_REG,
some of those calls will now fail.
This was not the intention. Instead, iterating KVM_*_ONE_REG over
the list of IDs returned by KVM_GET_REG_LIST should be guaranteed
to work.
This patch fixes the problem by splitting validate_core_reg_id()
into a backend core_reg_size_from_offset() which does all of the
work except for checking that the size field in the register ID
matches, and kvm_arm_copy_reg_indices() and num_core_regs() are
converted to use this to enumerate the valid offsets.
kvm_arm_copy_reg_indices() now also sets the register ID size field
appropriately based on the value returned, so the register ID
supplied to userspace is fully qualified for use with the register
access ioctls.
Cc: stable(a)vger.kernel.org
Fixes: d26c25a9d19b ("arm64: KVM: Tighten guest core register access from userspace")
Signed-off-by: Dave Martin <Dave.Martin(a)arm.com>
---
arch/arm64/kvm/guest.c | 61 ++++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 54 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index dd436a5..cbe423b 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -57,9 +57,8 @@ static u64 core_reg_offset_from_id(u64 id)
return id & ~(KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK | KVM_REG_ARM_CORE);
}
-static int validate_core_offset(const struct kvm_one_reg *reg)
+static int core_reg_size_from_offset(u64 off)
{
- u64 off = core_reg_offset_from_id(reg->id);
int size;
switch (off) {
@@ -89,8 +88,21 @@ static int validate_core_offset(const struct kvm_one_reg *reg)
return -EINVAL;
}
- if (KVM_REG_SIZE(reg->id) == size &&
- IS_ALIGNED(off, size / sizeof(__u32)))
+ if (IS_ALIGNED(off, size / sizeof(__u32)))
+ return size;
+
+ return -EINVAL;
+}
+
+static int validate_core_offset(const struct kvm_one_reg *reg)
+{
+ u64 off = core_reg_offset_from_id(reg->id);
+ int size = core_reg_size_from_offset(off);
+
+ if (size < 0)
+ return -EINVAL;
+
+ if (KVM_REG_SIZE(reg->id) == size)
return 0;
return -EINVAL;
@@ -195,7 +207,19 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
static unsigned long num_core_regs(void)
{
- return sizeof(struct kvm_regs) / sizeof(__u32);
+ unsigned int i;
+ int n = 0;
+
+ for (i = 0; i < sizeof(struct kvm_regs) / sizeof(__u32); i++) {
+ int size = core_reg_size_from_offset(i);
+
+ if (size < 0)
+ continue;
+
+ n++;
+ }
+
+ return n;
}
/**
@@ -270,11 +294,34 @@ unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
{
unsigned int i;
- const u64 core_reg = KVM_REG_ARM64 | KVM_REG_SIZE_U64 | KVM_REG_ARM_CORE;
int ret;
for (i = 0; i < sizeof(struct kvm_regs) / sizeof(__u32); i++) {
- if (put_user(core_reg | i, uindices))
+ u64 reg = KVM_REG_ARM64 | KVM_REG_ARM_CORE | i;
+ int size = core_reg_size_from_offset(i);
+
+ if (size < 0)
+ continue;
+
+ switch (size) {
+ case sizeof(__u32):
+ reg |= KVM_REG_SIZE_U32;
+ break;
+
+ case sizeof(__u64):
+ reg |= KVM_REG_SIZE_U64;
+ break;
+
+ case sizeof(__uint128_t):
+ reg |= KVM_REG_SIZE_U128;
+ break;
+
+ default:
+ WARN_ON(1);
+ continue;
+ }
+
+ if (put_user(reg, uindices))
return -EFAULT;
uindices++;
}
--
2.1.4
Hi Greg,
I did some randconfig testing on linux-4.19 arm/arm64/x86. So far I needed
27 patches, most of which are also still needed in mainline Linux. I
had submitted some before, and others were not submitted previously
for some reason. I'll try to get those fixed in mainline and then
make sure we get them into 4.19 as well.
This series for now contains four patches that did make it into mainline:
2e6ae11dd0d1 ("slimbus: ngd: mark PM functions as __maybe_unused")
33f49571d750 ("staging: olpc_dcon: add a missing dependency")
0eeec01488da ("scsi: raid_attrs: fix unused variable warning")
11d4afd4ff66 ("sched/pelt: Fix warning and clean up IRQ PELT config")
Feel free to either cherry-pick those from mainline or apply the
patch from this series, whichever works best for you.
The other three patches are for warnings in code that got removed in
mainline kernels:
3e9efc3299dd ("i2c: aspeed: Handle master/slave combined irq events properly")
972910948fb6 ("ARM: dts: qcom: Remove Arrow SD600 eval board")
effec874792f ("drm/msm/dpu: Remove dpu_dbg")
My feeling was that it's safer to just address the warning by fixing
the code correctly in each of these cases, but if you disagree,
applying the mainline change should work equally well, so decide
for yourself.
Arnd
Arnd Bergmann (5):
scsi: raid_attrs: fix unused variable warning
slimbus: ngd: mark PM functions as __maybe_unused
[stable-4.19] i2c: aspeed: fix build warning
[stable-4.19] ARM: dts: qcom-apq8064-arrow-sd-600eval fix
graph_endpoint warning
[stable-4.19] drm/msm: fix address space warning
Lubomir Rintel (1):
staging: olpc_dcon: add a missing dependency
Vincent Guittot (1):
sched/pelt: Fix warning and clean up IRQ PELT config
arch/arm/boot/dts/qcom-apq8064-arrow-sd-600eval.dts | 5 +++++
drivers/gpu/drm/msm/disp/dpu1/dpu_dbg.c | 8 ++++----
drivers/i2c/busses/i2c-aspeed.c | 4 +++-
drivers/scsi/raid_class.c | 4 +---
drivers/slimbus/qcom-ngd-ctrl.c | 6 ++----
drivers/staging/olpc_dcon/Kconfig | 1 +
init/Kconfig | 5 +++++
kernel/sched/core.c | 7 +++----
kernel/sched/fair.c | 2 +-
kernel/sched/pelt.c | 2 +-
kernel/sched/pelt.h | 2 +-
kernel/sched/sched.h | 5 ++---
12 files changed, 29 insertions(+), 22 deletions(-)
Cc: Andrew Jeffery <andrew(a)aj.id.au>
Cc: Andy Gross <andy.gross(a)linaro.org>
Cc: bp(a)alien8.de
Cc: Daniel Drake <dsd(a)laptop.org>
Cc: David Brown <david.brown(a)linaro.org>
Cc: dou_liyang(a)163.com
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: "James E.J. Bottomley" <jejb(a)linux.vnet.ibm.com>
Cc: Jens Frederich <jfrederich(a)gmail.com>
Cc: Lubomir Rintel <lkundrak(a)v3.sk>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: "Martin K. Petersen" <martin.petersen(a)oracle.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rob Clark <robdclark(a)gmail.com>
Cc: Rob Herring <robh+dt(a)kernel.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
Cc: Vincent Guittot <vincent.guittot(a)linaro.org>
Cc: linux-arm-msm(a)vger.kernel.org
Cc: devicetree(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: freedreno(a)lists.freedesktop.org
Cc: linux-i2c(a)vger.kernel.org
Cc: openbmc(a)lists.ozlabs.org
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-aspeed(a)lists.ozlabs.org
Cc: linux-scsi(a)vger.kernel.org
--
2.20.0
Mediatek Preloader is a proprietary embedded boot loader for loading
Little Kernel and Linux into device DRAM.
This boot loader also handle firmware update. Mediatek Preloader will be
enumerated as a virtual COM port when the device is connected to Windows
or Linux OS via CDC-ACM class driver. When the USB enumeration has been
done, Mediatek Preloader will send out handshake command "READY" to PC
actively instead of waiting command from the download tool.
Since Linux 4.12, the commit "tty: reset termios state on device
registration" (93857edd9829e144acb6c7e72d593f6e01aead66) causes Mediatek
Preloader receiving some abnoraml command like "READYXX" as it sent.
This will be recognized as an incorrect response. The behavior change
also causes the download handshake fail. This change only affects
subsequent connects if the reconnected device happens to get the same minor
number.
By disabling the ECHO termios flag could avoid this problem. However, it
cannot be done by user space configuration when download tool open
/dev/ttyACM0. This is because the device running Mediatek Preloader will
send handshake command "READY" immediately once the CDC-ACM driver is
ready.
This patch wants to fix above problem by introducing "DISABLE_ECHO"
property in driver_info. When Mediatek Preloader is connected, the
CDC-ACM driver could disable ECHO flag in termios to avoid the problem.
Signed-off-by: Macpaul Lin <macpaul.lin(a)mediatek.com>
Cc: stable(a)vger.kernel.org
---
Changes for v2:
- Move quirks testing of DISABLE_ECHO flag into acm_tty_install().
- Change quirks testing into bitwise comparison.
Changes for v3:
- Replace quirks testing from init_termios to tty->termios.
- Remove parenthesis for ECHO flag.
Changes for v4:
- Drop quirks varible to simplify the patch.
- Move termios operation right after the driver_data has been installed.
- Write general style comment for suppressing initial echoing.
Changes for v5:
- Fix: termios operation right abover the driver_data has been installed.
- Update commit comment about this patch affects the reconnected device
which get the same minor numbers.
Changes for v6:
- Update VID/PID:0x0e8d/0x0003 as Mediatek Inc BROM.
- Update VID/PID:0x0e8d/0x2000 as Mediatek Inc Preloader.
drivers/usb/class/cdc-acm.c | 14 ++++++++++++--
drivers/usb/class/cdc-acm.h | 1 +
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c
index 1b68fed..161e2d4 100644
--- a/drivers/usb/class/cdc-acm.c
+++ b/drivers/usb/class/cdc-acm.c
@@ -581,6 +581,13 @@ static int acm_tty_install(struct tty_driver *driver, struct tty_struct *tty)
if (retval)
goto error_init_termios;
+ /*
+ * Suppress initial echoing for some devices which might send data
+ * immediately after acm driver has been installed.
+ */
+ if (acm->quirks & DISABLE_ECHO)
+ tty->termios.c_lflag &= ~ECHO;
+
tty->driver_data = acm;
return 0;
@@ -1654,8 +1661,11 @@ static int acm_pre_reset(struct usb_interface *intf)
{ USB_DEVICE(0x0870, 0x0001), /* Metricom GS Modem */
.driver_info = NO_UNION_NORMAL, /* has no union descriptor */
},
- { USB_DEVICE(0x0e8d, 0x0003), /* FIREFLY, MediaTek Inc; andrey.arapov(a)gmail.com */
- .driver_info = NO_UNION_NORMAL, /* has no union descriptor */
+ { USB_DEVICE(0x0e8d, 0x0003), /* MediaTek Inc BROM */
+ .driver_info = DISABLE_ECHO, /* DISABLE ECHO in termios flag */
+ },
+ { USB_DEVICE(0x0e8d, 0x2000), /* MediaTek Inc Preloader */
+ .driver_info = DISABLE_ECHO, /* DISABLE ECHO in termios flag */
},
{ USB_DEVICE(0x0e8d, 0x3329), /* MediaTek Inc GPS */
.driver_info = NO_UNION_NORMAL, /* has no union descriptor */
diff --git a/drivers/usb/class/cdc-acm.h b/drivers/usb/class/cdc-acm.h
index ca06b20..515aad0 100644
--- a/drivers/usb/class/cdc-acm.h
+++ b/drivers/usb/class/cdc-acm.h
@@ -140,3 +140,4 @@ struct acm {
#define QUIRK_CONTROL_LINE_STATE BIT(6)
#define CLEAR_HALT_CONDITIONS BIT(7)
#define SEND_ZERO_PACKET BIT(8)
+#define DISABLE_ECHO BIT(9)
--
1.9.1
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d57f9da890696af1484f4a47f7f123560197865a Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal(a)wdc.com>
Date: Fri, 30 Nov 2018 15:31:48 +0900
Subject: [PATCH] dm zoned: Fix target BIO completion handling
struct bioctx includes the ref refcount_t to track the number of I/O
fragments used to process a target BIO as well as ensure that the zone
of the BIO is kept in the active state throughout the lifetime of the
BIO. However, since decrementing of this reference count is done in the
target .end_io method, the function bio_endio() must be called multiple
times for read and write target BIOs, which causes problems with the
value of the __bi_remaining struct bio field for chained BIOs (e.g. the
clone BIO passed by dm core is large and splits into fragments by the
block layer), resulting in incorrect values and inconsistencies with the
BIO_CHAIN flag setting. This is turn triggers the BUG_ON() call:
BUG_ON(atomic_read(&bio->__bi_remaining) <= 0);
in bio_remaining_done() called from bio_endio().
Fix this ensuring that bio_endio() is called only once for any target
BIO by always using internal clone BIOs for processing any read or
write target BIO. This allows reference counting using the target BIO
context counter to trigger the target BIO completion bio_endio() call
once all data, metadata and other zone work triggered by the BIO
complete.
Overall, this simplifies the code too as the target .end_io becomes
unnecessary and differences between read and write BIO issuing and
completion processing disappear.
Fixes: 3b1a94c88b79 ("dm zoned: drive-managed zoned block device target")
Cc: stable(a)vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal(a)wdc.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index 981154e59461..6af5babe6837 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -20,7 +20,6 @@ struct dmz_bioctx {
struct dm_zone *zone;
struct bio *bio;
refcount_t ref;
- blk_status_t status;
};
/*
@@ -78,65 +77,66 @@ static inline void dmz_bio_endio(struct bio *bio, blk_status_t status)
{
struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
- if (bioctx->status == BLK_STS_OK && status != BLK_STS_OK)
- bioctx->status = status;
- bio_endio(bio);
+ if (status != BLK_STS_OK && bio->bi_status == BLK_STS_OK)
+ bio->bi_status = status;
+
+ if (refcount_dec_and_test(&bioctx->ref)) {
+ struct dm_zone *zone = bioctx->zone;
+
+ if (zone) {
+ if (bio->bi_status != BLK_STS_OK &&
+ bio_op(bio) == REQ_OP_WRITE &&
+ dmz_is_seq(zone))
+ set_bit(DMZ_SEQ_WRITE_ERR, &zone->flags);
+ dmz_deactivate_zone(zone);
+ }
+ bio_endio(bio);
+ }
}
/*
- * Partial clone read BIO completion callback. This terminates the
+ * Completion callback for an internally cloned target BIO. This terminates the
* target BIO when there are no more references to its context.
*/
-static void dmz_read_bio_end_io(struct bio *bio)
+static void dmz_clone_endio(struct bio *clone)
{
- struct dmz_bioctx *bioctx = bio->bi_private;
- blk_status_t status = bio->bi_status;
+ struct dmz_bioctx *bioctx = clone->bi_private;
+ blk_status_t status = clone->bi_status;
- bio_put(bio);
+ bio_put(clone);
dmz_bio_endio(bioctx->bio, status);
}
/*
- * Issue a BIO to a zone. The BIO may only partially process the
+ * Issue a clone of a target BIO. The clone may only partially process the
* original target BIO.
*/
-static int dmz_submit_read_bio(struct dmz_target *dmz, struct dm_zone *zone,
- struct bio *bio, sector_t chunk_block,
- unsigned int nr_blocks)
+static int dmz_submit_bio(struct dmz_target *dmz, struct dm_zone *zone,
+ struct bio *bio, sector_t chunk_block,
+ unsigned int nr_blocks)
{
struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
- sector_t sector;
struct bio *clone;
- /* BIO remap sector */
- sector = dmz_start_sect(dmz->metadata, zone) + dmz_blk2sect(chunk_block);
-
- /* If the read is not partial, there is no need to clone the BIO */
- if (nr_blocks == dmz_bio_blocks(bio)) {
- /* Setup and submit the BIO */
- bio->bi_iter.bi_sector = sector;
- refcount_inc(&bioctx->ref);
- generic_make_request(bio);
- return 0;
- }
-
- /* Partial BIO: we need to clone the BIO */
clone = bio_clone_fast(bio, GFP_NOIO, &dmz->bio_set);
if (!clone)
return -ENOMEM;
- /* Setup the clone */
- clone->bi_iter.bi_sector = sector;
+ bio_set_dev(clone, dmz->dev->bdev);
+ clone->bi_iter.bi_sector =
+ dmz_start_sect(dmz->metadata, zone) + dmz_blk2sect(chunk_block);
clone->bi_iter.bi_size = dmz_blk2sect(nr_blocks) << SECTOR_SHIFT;
- clone->bi_end_io = dmz_read_bio_end_io;
+ clone->bi_end_io = dmz_clone_endio;
clone->bi_private = bioctx;
bio_advance(bio, clone->bi_iter.bi_size);
- /* Submit the clone */
refcount_inc(&bioctx->ref);
generic_make_request(clone);
+ if (bio_op(bio) == REQ_OP_WRITE && dmz_is_seq(zone))
+ zone->wp_block += nr_blocks;
+
return 0;
}
@@ -214,7 +214,7 @@ static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
if (nr_blocks) {
/* Valid blocks found: read them */
nr_blocks = min_t(unsigned int, nr_blocks, end_block - chunk_block);
- ret = dmz_submit_read_bio(dmz, rzone, bio, chunk_block, nr_blocks);
+ ret = dmz_submit_bio(dmz, rzone, bio, chunk_block, nr_blocks);
if (ret)
return ret;
chunk_block += nr_blocks;
@@ -228,25 +228,6 @@ static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
return 0;
}
-/*
- * Issue a write BIO to a zone.
- */
-static void dmz_submit_write_bio(struct dmz_target *dmz, struct dm_zone *zone,
- struct bio *bio, sector_t chunk_block,
- unsigned int nr_blocks)
-{
- struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
-
- /* Setup and submit the BIO */
- bio_set_dev(bio, dmz->dev->bdev);
- bio->bi_iter.bi_sector = dmz_start_sect(dmz->metadata, zone) + dmz_blk2sect(chunk_block);
- refcount_inc(&bioctx->ref);
- generic_make_request(bio);
-
- if (dmz_is_seq(zone))
- zone->wp_block += nr_blocks;
-}
-
/*
* Write blocks directly in a data zone, at the write pointer.
* If a buffer zone is assigned, invalidate the blocks written
@@ -265,7 +246,9 @@ static int dmz_handle_direct_write(struct dmz_target *dmz,
return -EROFS;
/* Submit write */
- dmz_submit_write_bio(dmz, zone, bio, chunk_block, nr_blocks);
+ ret = dmz_submit_bio(dmz, zone, bio, chunk_block, nr_blocks);
+ if (ret)
+ return ret;
/*
* Validate the blocks in the data zone and invalidate
@@ -301,7 +284,9 @@ static int dmz_handle_buffered_write(struct dmz_target *dmz,
return -EROFS;
/* Submit write */
- dmz_submit_write_bio(dmz, bzone, bio, chunk_block, nr_blocks);
+ ret = dmz_submit_bio(dmz, bzone, bio, chunk_block, nr_blocks);
+ if (ret)
+ return ret;
/*
* Validate the blocks in the buffer zone
@@ -600,7 +585,6 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
bioctx->zone = NULL;
bioctx->bio = bio;
refcount_set(&bioctx->ref, 1);
- bioctx->status = BLK_STS_OK;
/* Set the BIO pending in the flush list */
if (!nr_sectors && bio_op(bio) == REQ_OP_WRITE) {
@@ -623,35 +607,6 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
return DM_MAPIO_SUBMITTED;
}
-/*
- * Completed target BIO processing.
- */
-static int dmz_end_io(struct dm_target *ti, struct bio *bio, blk_status_t *error)
-{
- struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
-
- if (bioctx->status == BLK_STS_OK && *error)
- bioctx->status = *error;
-
- if (!refcount_dec_and_test(&bioctx->ref))
- return DM_ENDIO_INCOMPLETE;
-
- /* Done */
- bio->bi_status = bioctx->status;
-
- if (bioctx->zone) {
- struct dm_zone *zone = bioctx->zone;
-
- if (*error && bio_op(bio) == REQ_OP_WRITE) {
- if (dmz_is_seq(zone))
- set_bit(DMZ_SEQ_WRITE_ERR, &zone->flags);
- }
- dmz_deactivate_zone(zone);
- }
-
- return DM_ENDIO_DONE;
-}
-
/*
* Get zoned device information.
*/
@@ -946,7 +901,6 @@ static struct target_type dmz_type = {
.ctr = dmz_ctr,
.dtr = dmz_dtr,
.map = dmz_map,
- .end_io = dmz_end_io,
.io_hints = dmz_io_hints,
.prepare_ioctl = dmz_prepare_ioctl,
.postsuspend = dmz_suspend,
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f6c367585d0d851349d3a9e607c43e5bea993fa1 Mon Sep 17 00:00:00 2001
From: Mike Snitzer <snitzer(a)redhat.com>
Date: Tue, 11 Dec 2018 13:31:40 -0500
Subject: [PATCH] dm thin: send event about thin-pool state change _after_
making it
Sending a DM event before a thin-pool state change is about to happen is
a bug. It wasn't realized until it became clear that userspace response
to the event raced with the actual state change that the event was
meant to notify about.
Fix this by first updating internal thin-pool state to reflect what the
DM event is being issued about. This fixes a long-standing racey/buggy
userspace device-mapper-test-suite 'resize_io' test that would get an
event but not find the state it was looking for -- so it would just go
on to hang because no other events caused the test to reevaluate the
thin-pool's state.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 0bd8d498b3b9..53f8d03f76f7 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -195,7 +195,7 @@ static void throttle_unlock(struct throttle *t)
struct dm_thin_new_mapping;
/*
- * The pool runs in 4 modes. Ordered in degraded order for comparisons.
+ * The pool runs in various modes. Ordered in degraded order for comparisons.
*/
enum pool_mode {
PM_WRITE, /* metadata may be changed */
@@ -282,9 +282,38 @@ struct pool {
mempool_t mapping_pool;
};
-static enum pool_mode get_pool_mode(struct pool *pool);
static void metadata_operation_failed(struct pool *pool, const char *op, int r);
+static enum pool_mode get_pool_mode(struct pool *pool)
+{
+ return pool->pf.mode;
+}
+
+static void notify_of_pool_mode_change(struct pool *pool)
+{
+ const char *descs[] = {
+ "write",
+ "out-of-data-space",
+ "read-only",
+ "read-only",
+ "fail"
+ };
+ const char *extra_desc = NULL;
+ enum pool_mode mode = get_pool_mode(pool);
+
+ if (mode == PM_OUT_OF_DATA_SPACE) {
+ if (!pool->pf.error_if_no_space)
+ extra_desc = " (queue IO)";
+ else
+ extra_desc = " (error IO)";
+ }
+
+ dm_table_event(pool->ti->table);
+ DMINFO("%s: switching pool to %s%s mode",
+ dm_device_name(pool->pool_md),
+ descs[(int)mode], extra_desc ? : "");
+}
+
/*
* Target context for a pool.
*/
@@ -2351,8 +2380,6 @@ static void do_waker(struct work_struct *ws)
queue_delayed_work(pool->wq, &pool->waker, COMMIT_PERIOD);
}
-static void notify_of_pool_mode_change_to_oods(struct pool *pool);
-
/*
* We're holding onto IO to allow userland time to react. After the
* timeout either the pool will have been resized (and thus back in
@@ -2365,7 +2392,7 @@ static void do_no_space_timeout(struct work_struct *ws)
if (get_pool_mode(pool) == PM_OUT_OF_DATA_SPACE && !pool->pf.error_if_no_space) {
pool->pf.error_if_no_space = true;
- notify_of_pool_mode_change_to_oods(pool);
+ notify_of_pool_mode_change(pool);
error_retry_list_with_code(pool, BLK_STS_NOSPC);
}
}
@@ -2433,26 +2460,6 @@ static void noflush_work(struct thin_c *tc, void (*fn)(struct work_struct *))
/*----------------------------------------------------------------*/
-static enum pool_mode get_pool_mode(struct pool *pool)
-{
- return pool->pf.mode;
-}
-
-static void notify_of_pool_mode_change(struct pool *pool, const char *new_mode)
-{
- dm_table_event(pool->ti->table);
- DMINFO("%s: switching pool to %s mode",
- dm_device_name(pool->pool_md), new_mode);
-}
-
-static void notify_of_pool_mode_change_to_oods(struct pool *pool)
-{
- if (!pool->pf.error_if_no_space)
- notify_of_pool_mode_change(pool, "out-of-data-space (queue IO)");
- else
- notify_of_pool_mode_change(pool, "out-of-data-space (error IO)");
-}
-
static bool passdown_enabled(struct pool_c *pt)
{
return pt->adjusted_pf.discard_passdown;
@@ -2501,8 +2508,6 @@ static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
switch (new_mode) {
case PM_FAIL:
- if (old_mode != new_mode)
- notify_of_pool_mode_change(pool, "failure");
dm_pool_metadata_read_only(pool->pmd);
pool->process_bio = process_bio_fail;
pool->process_discard = process_bio_fail;
@@ -2516,8 +2521,6 @@ static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
case PM_OUT_OF_METADATA_SPACE:
case PM_READ_ONLY:
- if (!is_read_only_pool_mode(old_mode))
- notify_of_pool_mode_change(pool, "read-only");
dm_pool_metadata_read_only(pool->pmd);
pool->process_bio = process_bio_read_only;
pool->process_discard = process_bio_success;
@@ -2538,8 +2541,6 @@ static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
* alarming rate. Adjust your low water mark if you're
* frequently seeing this mode.
*/
- if (old_mode != new_mode)
- notify_of_pool_mode_change_to_oods(pool);
pool->out_of_data_space = true;
pool->process_bio = process_bio_read_only;
pool->process_discard = process_discard_bio;
@@ -2552,8 +2553,6 @@ static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
break;
case PM_WRITE:
- if (old_mode != new_mode)
- notify_of_pool_mode_change(pool, "write");
if (old_mode == PM_OUT_OF_DATA_SPACE)
cancel_delayed_work_sync(&pool->no_space_timeout);
pool->out_of_data_space = false;
@@ -2573,6 +2572,9 @@ static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
* doesn't cause an unexpected mode transition on resume.
*/
pt->adjusted_pf.mode = new_mode;
+
+ if (old_mode != new_mode)
+ notify_of_pool_mode_change(pool);
}
static void abort_transaction(struct pool *pool)
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f6c367585d0d851349d3a9e607c43e5bea993fa1 Mon Sep 17 00:00:00 2001
From: Mike Snitzer <snitzer(a)redhat.com>
Date: Tue, 11 Dec 2018 13:31:40 -0500
Subject: [PATCH] dm thin: send event about thin-pool state change _after_
making it
Sending a DM event before a thin-pool state change is about to happen is
a bug. It wasn't realized until it became clear that userspace response
to the event raced with the actual state change that the event was
meant to notify about.
Fix this by first updating internal thin-pool state to reflect what the
DM event is being issued about. This fixes a long-standing racey/buggy
userspace device-mapper-test-suite 'resize_io' test that would get an
event but not find the state it was looking for -- so it would just go
on to hang because no other events caused the test to reevaluate the
thin-pool's state.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 0bd8d498b3b9..53f8d03f76f7 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -195,7 +195,7 @@ static void throttle_unlock(struct throttle *t)
struct dm_thin_new_mapping;
/*
- * The pool runs in 4 modes. Ordered in degraded order for comparisons.
+ * The pool runs in various modes. Ordered in degraded order for comparisons.
*/
enum pool_mode {
PM_WRITE, /* metadata may be changed */
@@ -282,9 +282,38 @@ struct pool {
mempool_t mapping_pool;
};
-static enum pool_mode get_pool_mode(struct pool *pool);
static void metadata_operation_failed(struct pool *pool, const char *op, int r);
+static enum pool_mode get_pool_mode(struct pool *pool)
+{
+ return pool->pf.mode;
+}
+
+static void notify_of_pool_mode_change(struct pool *pool)
+{
+ const char *descs[] = {
+ "write",
+ "out-of-data-space",
+ "read-only",
+ "read-only",
+ "fail"
+ };
+ const char *extra_desc = NULL;
+ enum pool_mode mode = get_pool_mode(pool);
+
+ if (mode == PM_OUT_OF_DATA_SPACE) {
+ if (!pool->pf.error_if_no_space)
+ extra_desc = " (queue IO)";
+ else
+ extra_desc = " (error IO)";
+ }
+
+ dm_table_event(pool->ti->table);
+ DMINFO("%s: switching pool to %s%s mode",
+ dm_device_name(pool->pool_md),
+ descs[(int)mode], extra_desc ? : "");
+}
+
/*
* Target context for a pool.
*/
@@ -2351,8 +2380,6 @@ static void do_waker(struct work_struct *ws)
queue_delayed_work(pool->wq, &pool->waker, COMMIT_PERIOD);
}
-static void notify_of_pool_mode_change_to_oods(struct pool *pool);
-
/*
* We're holding onto IO to allow userland time to react. After the
* timeout either the pool will have been resized (and thus back in
@@ -2365,7 +2392,7 @@ static void do_no_space_timeout(struct work_struct *ws)
if (get_pool_mode(pool) == PM_OUT_OF_DATA_SPACE && !pool->pf.error_if_no_space) {
pool->pf.error_if_no_space = true;
- notify_of_pool_mode_change_to_oods(pool);
+ notify_of_pool_mode_change(pool);
error_retry_list_with_code(pool, BLK_STS_NOSPC);
}
}
@@ -2433,26 +2460,6 @@ static void noflush_work(struct thin_c *tc, void (*fn)(struct work_struct *))
/*----------------------------------------------------------------*/
-static enum pool_mode get_pool_mode(struct pool *pool)
-{
- return pool->pf.mode;
-}
-
-static void notify_of_pool_mode_change(struct pool *pool, const char *new_mode)
-{
- dm_table_event(pool->ti->table);
- DMINFO("%s: switching pool to %s mode",
- dm_device_name(pool->pool_md), new_mode);
-}
-
-static void notify_of_pool_mode_change_to_oods(struct pool *pool)
-{
- if (!pool->pf.error_if_no_space)
- notify_of_pool_mode_change(pool, "out-of-data-space (queue IO)");
- else
- notify_of_pool_mode_change(pool, "out-of-data-space (error IO)");
-}
-
static bool passdown_enabled(struct pool_c *pt)
{
return pt->adjusted_pf.discard_passdown;
@@ -2501,8 +2508,6 @@ static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
switch (new_mode) {
case PM_FAIL:
- if (old_mode != new_mode)
- notify_of_pool_mode_change(pool, "failure");
dm_pool_metadata_read_only(pool->pmd);
pool->process_bio = process_bio_fail;
pool->process_discard = process_bio_fail;
@@ -2516,8 +2521,6 @@ static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
case PM_OUT_OF_METADATA_SPACE:
case PM_READ_ONLY:
- if (!is_read_only_pool_mode(old_mode))
- notify_of_pool_mode_change(pool, "read-only");
dm_pool_metadata_read_only(pool->pmd);
pool->process_bio = process_bio_read_only;
pool->process_discard = process_bio_success;
@@ -2538,8 +2541,6 @@ static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
* alarming rate. Adjust your low water mark if you're
* frequently seeing this mode.
*/
- if (old_mode != new_mode)
- notify_of_pool_mode_change_to_oods(pool);
pool->out_of_data_space = true;
pool->process_bio = process_bio_read_only;
pool->process_discard = process_discard_bio;
@@ -2552,8 +2553,6 @@ static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
break;
case PM_WRITE:
- if (old_mode != new_mode)
- notify_of_pool_mode_change(pool, "write");
if (old_mode == PM_OUT_OF_DATA_SPACE)
cancel_delayed_work_sync(&pool->no_space_timeout);
pool->out_of_data_space = false;
@@ -2573,6 +2572,9 @@ static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
* doesn't cause an unexpected mode transition on resume.
*/
pt->adjusted_pf.mode = new_mode;
+
+ if (old_mode != new_mode)
+ notify_of_pool_mode_change(pool);
}
static void abort_transaction(struct pool *pool)
Mediatek Preloader is a proprietary embedded boot loader for loading
Little Kernel and Linux into device DRAM.
This boot loader also handle firmware update. Mediatek Preloader will be
enumerated as a virtual COM port when the device is connected to Windows
or Linux OS via CDC-ACM class driver. When the USB enumeration has been
done, Mediatek Preloader will send out handshake command "READY" to PC
actively instead of waiting command from the download tool.
Since Linux 4.12, the commit "tty: reset termios state on device
registration" (93857edd9829e144acb6c7e72d593f6e01aead66) causes Mediatek
Preloader receiving some abnoraml command like "READYXX" as it sent.
This will be recognized as an incorrect response. The behavior change
also causes the download handshake fail.
By disabling the ECHO termios flag could avoid this problem. However, it
cannot be done by user space configuration when download tool open
/dev/ttyACM0. This is because the device running Mediatek Preloader will
send handshake command "READY" immediately once the CDC-ACM driver is
ready.
This patch wants to fix above problem by introducing "DISABLE_ECHO"
property in driver_info. When Mediatek Preloader is connected, the
CDC-ACM driver could disable ECHO flag in termios to avoid the problem.
Signed-off-by: Macpaul Lin <macpaul.lin(a)mediatek.com>
---
Changes for v2:
- Move quirks testing of DISABLE_ECHO flag into acm_tty_install().
- Change quirks testing into bitwise comparison.
Changes for v3:
- Replace quirks testing from init_termios to tty->termios.
- Remove parenthesis for ECHO flag.
Changes for v4:
- Drop quirks varible to simplify the patch.
- Move termios operation right after the driver_data has been installed.
- Write general style comment for suppressing initial echoing.
drivers/usb/class/cdc-acm.c | 12 +++++++++++-
drivers/usb/class/cdc-acm.h | 1 +
2 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c
index 1b68fed..ab8609e 100644
--- a/drivers/usb/class/cdc-acm.c
+++ b/drivers/usb/class/cdc-acm.c
@@ -583,6 +583,13 @@ static int acm_tty_install(struct tty_driver *driver, struct tty_struct *tty)
tty->driver_data = acm;
+ /*
+ * Suppress initial echoing for some devices which might send data
+ * immediately after acm driver has been installed.
+ */
+ if (acm->quirks & DISABLE_ECHO)
+ tty->termios.c_lflag &= ~ECHO;
+
return 0;
error_init_termios:
@@ -1655,7 +1662,10 @@ static int acm_pre_reset(struct usb_interface *intf)
.driver_info = NO_UNION_NORMAL, /* has no union descriptor */
},
{ USB_DEVICE(0x0e8d, 0x0003), /* FIREFLY, MediaTek Inc; andrey.arapov(a)gmail.com */
- .driver_info = NO_UNION_NORMAL, /* has no union descriptor */
+ .driver_info = DISABLE_ECHO, /* DISABLE ECHO in termios flag */
+ },
+ { USB_DEVICE(0x0e8d, 0x2000), /* FIREFLY, MediaTek Inc; Preloader */
+ .driver_info = DISABLE_ECHO, /* DISABLE ECHO in termios flag */
},
{ USB_DEVICE(0x0e8d, 0x3329), /* MediaTek Inc GPS */
.driver_info = NO_UNION_NORMAL, /* has no union descriptor */
diff --git a/drivers/usb/class/cdc-acm.h b/drivers/usb/class/cdc-acm.h
index ca06b20..515aad0 100644
--- a/drivers/usb/class/cdc-acm.h
+++ b/drivers/usb/class/cdc-acm.h
@@ -140,3 +140,4 @@ struct acm {
#define QUIRK_CONTROL_LINE_STATE BIT(6)
#define CLEAR_HALT_CONDITIONS BIT(7)
#define SEND_ZERO_PACKET BIT(8)
+#define DISABLE_ECHO BIT(9)
--
1.9.1
Mediatek Preloader is a proprietary embedded boot loader for loading
Little Kernel and Linux into device DRAM.
This boot loader also handle firmware update. Mediatek Preloader will be
enumerated as a virtual COM port when the device is connected to Windows
or Linux OS via CDC-ACM class driver. When the USB enumeration has been
done, Mediatek Preloader will send out handshake command "READY" to PC
actively instead of waiting command from the download tool.
Since Linux 4.12, the commit "tty: reset termios state on device
registration" (93857edd9829e144acb6c7e72d593f6e01aead66) causes Mediatek
Preloader receiving some abnoraml command like "READYXX" as it sent.
This will be recognized as an incorrect response. The behavior change
also causes the download handshake fail.
By disabling the ECHO termios flag could avoid this problem. However, it
cannot be done by user space configuration when download tool open
/dev/ttyACM0. This is because the device running Mediatek Preloader will
send handshake command "READY" immediately once the CDC-ACM driver is
ready.
This patch wants to fix above problem by introducing "DISABLE_ECHO"
property in driver_info. When Mediatek Preloader is connected, the
CDC-ACM driver could disable ECHO flag in termios to avoid the problem.
Signed-off-by: Macpaul Lin <macpaul.lin(a)mediatek.com>
---
Changes for v2:
- Move quirks testing of DISABLE_ECHO flag into acm_tty_install().
- Change quirks testing into bitwise comparison.
Changes for v3:
- Replace clear flag operation from driver->init_termios to tty->termios.
- Remove parenthesis of ECHO flag.
drivers/usb/class/cdc-acm.c | 13 ++++++++++++-
drivers/usb/class/cdc-acm.h | 1 +
2 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c
index 1b68fed..c1b88c3 100644
--- a/drivers/usb/class/cdc-acm.c
+++ b/drivers/usb/class/cdc-acm.c
@@ -571,6 +571,7 @@ static void acm_softint(struct work_struct *work)
static int acm_tty_install(struct tty_driver *driver, struct tty_struct *tty)
{
struct acm *acm;
+ unsigned long quirks;
int retval;
acm = acm_get_by_minor(tty->index);
@@ -583,6 +584,13 @@ static int acm_tty_install(struct tty_driver *driver, struct tty_struct *tty)
tty->driver_data = acm;
+ /* get normal quirks */
+ quirks = acm->quirks;
+
+ /* handle active handshake triggered by device */
+ if (quirks & DISABLE_ECHO)
+ tty->termios.c_lflag &= ~ECHO;
+
return 0;
error_init_termios:
@@ -1655,7 +1663,10 @@ static int acm_pre_reset(struct usb_interface *intf)
.driver_info = NO_UNION_NORMAL, /* has no union descriptor */
},
{ USB_DEVICE(0x0e8d, 0x0003), /* FIREFLY, MediaTek Inc; andrey.arapov(a)gmail.com */
- .driver_info = NO_UNION_NORMAL, /* has no union descriptor */
+ .driver_info = DISABLE_ECHO, /* DISABLE ECHO in termios flag */
+ },
+ { USB_DEVICE(0x0e8d, 0x2000), /* FIREFLY, MediaTek Inc; Preloader */
+ .driver_info = DISABLE_ECHO, /* DISABLE ECHO in termios flag */
},
{ USB_DEVICE(0x0e8d, 0x3329), /* MediaTek Inc GPS */
.driver_info = NO_UNION_NORMAL, /* has no union descriptor */
diff --git a/drivers/usb/class/cdc-acm.h b/drivers/usb/class/cdc-acm.h
index ca06b20..515aad0 100644
--- a/drivers/usb/class/cdc-acm.h
+++ b/drivers/usb/class/cdc-acm.h
@@ -140,3 +140,4 @@ struct acm {
#define QUIRK_CONTROL_LINE_STATE BIT(6)
#define CLEAR_HALT_CONDITIONS BIT(7)
#define SEND_ZERO_PACKET BIT(8)
+#define DISABLE_ECHO BIT(9)
--
1.9.1
Mediatek Preloader is a proprietary embedded boot loader for loading
Little Kernel and Linux into device DRAM.
This boot loader also handle firmware update. Mediatek Preloader will be
enumerated as a virtual COM port when the device is connected to Windows
or Linux OS via CDC-ACM class driver. When the USB enumeration has been
done, Mediatek Preloader will send out handshake command "READY" to PC
actively instead of waiting command from the download tool.
Since Linux 4.12, the commit "tty: reset termios state on device
registration" (93857edd9829e144acb6c7e72d593f6e01aead66) causes Mediatek
Preloader receiving some abnoraml command like "READYXX" as it sent.
This will be recognized as an incorrect response. The behavior change
also causes the download handshake fail.
By disabling the ECHO termios flag could avoid this problem. However, it
cannot be done by user space configuration when download tool open
/dev/ttyACM0. This is because the device running Mediatek Preloader will
send handshake command "READY" immediately once the CDC-ACM driver is
ready.
This patch wants to fix above problem by introducing "DISABLE_ECHO"
property in driver_info. When Mediatek Preloader is connected, the
CDC-ACM driver could disable ECHO flag in termios to avoid the problem.
Signed-off-by: Macpaul Lin <macpaul.lin(a)mediatek.com>
---
Change for v2:
- Move quirks testing of DISABLE_ECHO flag into acm_tty_install().
- Change quirks testing into bitwise comparison.
drivers/usb/class/cdc-acm.c | 13 ++++++++++++-
drivers/usb/class/cdc-acm.h | 1 +
2 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c
index 1b68fed..f1a914d 100644
--- a/drivers/usb/class/cdc-acm.c
+++ b/drivers/usb/class/cdc-acm.c
@@ -571,12 +571,20 @@ static void acm_softint(struct work_struct *work)
static int acm_tty_install(struct tty_driver *driver, struct tty_struct *tty)
{
struct acm *acm;
+ unsigned long quirks;
int retval;
acm = acm_get_by_minor(tty->index);
if (!acm)
return -ENODEV;
+ /* get normal quirks */
+ quirks = acm->quirks;
+
+ /* handle active handshake triggered by device */
+ if (quirks & DISABLE_ECHO)
+ driver->init_termios.c_lflag &= ~(ECHO);
+
retval = tty_standard_install(driver, tty);
if (retval)
goto error_init_termios;
@@ -1655,7 +1663,10 @@ static int acm_pre_reset(struct usb_interface *intf)
.driver_info = NO_UNION_NORMAL, /* has no union descriptor */
},
{ USB_DEVICE(0x0e8d, 0x0003), /* FIREFLY, MediaTek Inc; andrey.arapov(a)gmail.com */
- .driver_info = NO_UNION_NORMAL, /* has no union descriptor */
+ .driver_info = DISABLE_ECHO, /* DISABLE ECHO in termios flag */
+ },
+ { USB_DEVICE(0x0e8d, 0x2000), /* FIREFLY, MediaTek Inc; Preloader */
+ .driver_info = DISABLE_ECHO, /* DISABLE ECHO in termios flag */
},
{ USB_DEVICE(0x0e8d, 0x3329), /* MediaTek Inc GPS */
.driver_info = NO_UNION_NORMAL, /* has no union descriptor */
diff --git a/drivers/usb/class/cdc-acm.h b/drivers/usb/class/cdc-acm.h
index ca06b20..515aad0 100644
--- a/drivers/usb/class/cdc-acm.h
+++ b/drivers/usb/class/cdc-acm.h
@@ -140,3 +140,4 @@ struct acm {
#define QUIRK_CONTROL_LINE_STATE BIT(6)
#define CLEAR_HALT_CONDITIONS BIT(7)
#define SEND_ZERO_PACKET BIT(8)
+#define DISABLE_ECHO BIT(9)
--
1.9.1
From: Long Li <longli(a)microsoft.com>
The current code attempts to pin memory using the largest possible wsize
based on the currect SMB credits. This doesn't cause kernel oops but this is
not optimal as we may pin more pages then actually needed.
Fix this by only pinning what are needed for doing this write I/O.
Signed-off-by: Long Li <longli(a)microsoft.com>
Cc: stable(a)vger.kernel.org
---
fs/cifs/file.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 3467351..c23bf9d 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -2617,11 +2617,13 @@ cifs_write_from_iter(loff_t offset, size_t len, struct iov_iter *from,
if (rc)
break;
+ cur_len = min_t(const size_t, len, wsize);
+
if (ctx->direct_io) {
ssize_t result;
result = iov_iter_get_pages_alloc(
- from, &pagevec, wsize, &start);
+ from, &pagevec, cur_len, &start);
if (result < 0) {
cifs_dbg(VFS,
"direct_writev couldn't get user pages "
--
2.7.4
Hi folks,
I recently started seeing spurious test failures with rbtree tests,
resulting in boot delays and random "hung task" warnings.
The problems have been fixed upstream with the following patches.
v4.14.y:
0b548e33e6cb lib/rbtree-test: lower default params
v4.4.y, v4.9.y:
a54dae0338b7 lib/interval_tree_test.c: make test options module parameters
c46ecce431eb lib/interval_tree_test.c: allow full tree search
223f8911eace lib/rbtree_test.c: make input module parameters
0b548e33e6cb lib/rbtree-test: lower default params
The first three patches for v4.4.y and v4.9.y are the minimum
set of context patches needed to avoid conflicts when applying
commit 0b548e33e6cb.
I tested all kernel versions with the patches applied and rbtree
testing enabled to ensure that no new problems are introduced.
Please consider applying those patches to the respective releases.
Thanks,
Guenter
From: Emmanuel Grumbach <emmanuel.grumbach(a)intel.com>
NullFunc packets should never be duplicate just like
QoS-NullFunc packets.
We saw a client that enters / exits power save with
NullFunc frames (and not with QoS-NullFunc) despite the
fact that the association supports HT.
This specific client also re-uses a non-zero sequence number
for different NullFunc frames.
At some point, the client had to send a retransmission of
the NullFunc frame and we dropped it, leading to a
misalignment in the power save state.
Fix this by never consider a NullFunc frame as duplicate,
just like we do for QoS NullFunc frames.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=201449
CC: <stable(a)vger.kernel.org>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach(a)intel.com>
Signed-off-by: Luca Coelho <luciano.coelho(a)intel.com>
---
net/mac80211/rx.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index 2394008f82b9..60d179bf2585 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -1404,6 +1404,7 @@ ieee80211_rx_h_check_dup(struct ieee80211_rx_data *rx)
return RX_CONTINUE;
if (ieee80211_is_ctl(hdr->frame_control) ||
+ ieee80211_is_nullfunc(hdr->frame_control) ||
ieee80211_is_qos_nullfunc(hdr->frame_control) ||
is_multicast_ether_addr(hdr->addr1))
return RX_CONTINUE;
--
2.19.2
[ Upstream commit 64c3f648c25d108f346fdc96c15180c6b7d250e9 ]
Once in a while I see build errors similar to the following
when building images from a clean tree.
Building powerpc:virtex-ml507:44x/virtex5_defconfig ... failed
------------
Error log:
arch/powerpc/boot/treeboot-akebono.c:37:20: fatal error:
libfdt.h: No such file or directory
Building powerpc:bamboo:smpdev:44x/bamboo_defconfig ... failed
------------
Error log:
arch/powerpc/boot/treeboot-akebono.c:37:20: fatal error:
libfdt.h: No such file or directory
arch/powerpc/boot/treeboot-currituck.c:35:20: fatal error:
libfdt.h: No such file or directory
Rebuilds will succeed.
Turns out that several source files in arch/powerpc/boot/ include
libfdt.h, but Makefile dependencies are incomplete. Let's fix that.
Signed-off-by: Guenter Roeck <linux(a)roeck-us.net>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
[groeck: Backport to v4.4.y]
Signed-off-by: Guenter Roeck <linux(a)roeck-us.net>
---
For some reason I see the build error fixed with this patch more often
lately. It would be great if you can apply it to v3.18.y and v4.4.y.
Tested on both v3.18.y and v4.4.y.
arch/powerpc/boot/Makefile | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index 99e4487248ff..57003d1bd243 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -70,7 +70,8 @@ $(addprefix $(obj)/,$(zlib) cuboot-c2k.o gunzip_util.o main.o): \
libfdt := fdt.c fdt_ro.c fdt_wip.c fdt_sw.c fdt_rw.c fdt_strerror.c
libfdtheader := fdt.h libfdt.h libfdt_internal.h
-$(addprefix $(obj)/,$(libfdt) libfdt-wrapper.o simpleboot.o epapr.o): \
+$(addprefix $(obj)/,$(libfdt) libfdt-wrapper.o simpleboot.o epapr.o \
+ treeboot-akebono.o treeboot-currituck.o treeboot-iss4xx.o): \
$(addprefix $(obj)/,$(libfdtheader))
src-wlib-y := string.S crt0.S crtsavres.S stdio.c main.c \
--
2.7.4
Please pick this commit for 4.14 and older stable branches:
commit 8e7df2b5b7f245c9bd11064712db5cb69044a362
Author: Ingo Molnar <mingo(a)kernel.org>
Date: Mon Nov 13 07:15:41 2017 +0100
timer/debug: Change /proc/timer_list from 0444 to 0400
In older kernel versions this file makes it far too easy to exploit
arbitrary-write bugs. It's possible to hide the pointers from
unprivileged users by setting the kernel.kptr_restrict sysctl, but that
wasn't done by default.
(Upstream commits c1eba5bcb643 "timer: Pass timer_list pointer to
callbacks unconditionally" and ad67b74d2469 "printk: hash addresses
printed with %p" provide more general mitigations, but don't seem to be
suitable for stable.)
Ben.
--
Ben Hutchings, Software Developer Codethink Ltd
https://www.codethink.co.uk/ Dale House, 35 Dale Street
Manchester, M1 2HF, United Kingdom
The patch titled
Subject: fork,memcg: fix crash in free_thread_stack on memcg charge fail
has been added to the -mm tree. Its filename is
forkmemcg-fix-crash-in-free_thread_stack-on-memcg-charge-fail.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/forkmemcg-fix-crash-in-free_thread…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/forkmemcg-fix-crash-in-free_thread…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Rik van Riel <riel(a)surriel.com>
Subject: fork,memcg: fix crash in free_thread_stack on memcg charge fail
Changeset 9b6f7e163cd0 ("mm: rework memcg kernel stack accounting")
will result in fork failing if allocating a kernel stack for a task
in dup_task_struct exceeds the kernel memory allowance for that cgroup.
Unfortunately, it also results in a crash.
This is due to the code jumping to free_stack and calling free_thread_stack
when the memcg kernel stack charge fails, but without tsk->stack pointing
at the freshly allocated stack.
This in turn results in the vfree_atomic in free_thread_stack oopsing
with a backtrace like this:
#5 [ffffc900244efc88] die at ffffffff8101f0ab
#6 [ffffc900244efcb8] do_general_protection at ffffffff8101cb86
#7 [ffffc900244efce0] general_protection at ffffffff818ff082
[exception RIP: llist_add_batch+7]
RIP: ffffffff8150d487 RSP: ffffc900244efd98 RFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff88085ef55980 RCX: 0000000000000000
RDX: ffff88085ef55980 RSI: 343834343531203a RDI: 343834343531203a
RBP: ffffc900244efd98 R8: 0000000000000001 R9: ffff8808578c3600
R10: 0000000000000000 R11: 0000000000000001 R12: ffff88029f6c21c0
R13: 0000000000000286 R14: ffff880147759b00 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#8 [ffffc900244efda0] vfree_atomic at ffffffff811df2c7
#9 [ffffc900244efdb8] copy_process at ffffffff81086e37
#10 [ffffc900244efe98] _do_fork at ffffffff810884e0
#11 [ffffc900244eff10] sys_vfork at ffffffff810887ff
#12 [ffffc900244eff20] do_syscall_64 at ffffffff81002a43
RIP: 000000000049b948 RSP: 00007ffcdb307830 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 0000000000896030 RCX: 000000000049b948
RDX: 0000000000000000 RSI: 00007ffcdb307790 RDI: 00000000005d7421
RBP: 000000000067370f R8: 00007ffcdb3077b0 R9: 000000000001ed00
R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000040
R13: 000000000000000f R14: 0000000000000000 R15: 000000000088d018
ORIG_RAX: 000000000000003a CS: 0033 SS: 002b
The simplest fix is to assign tsk->stack right where it is allocated.
Link: http://lkml.kernel.org/r/20181214231726.7ee4843c@imladris.surriel.com
Fixes: 9b6f7e163cd0 ("mm: rework memcg kernel stack accounting")
Signed-off-by: Rik van Riel <riel(a)surriel.com>
Acked-by: Roman Gushchin <guro(a)fb.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/fork.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
--- a/kernel/fork.c~forkmemcg-fix-crash-in-free_thread_stack-on-memcg-charge-fail
+++ a/kernel/fork.c
@@ -240,8 +240,10 @@ static unsigned long *alloc_thread_stack
* free_thread_stack() can be called in interrupt context,
* so cache the vm_struct.
*/
- if (stack)
+ if (stack) {
tsk->stack_vm_area = find_vm_area(stack);
+ tsk->stack = stack;
+ }
return stack;
#else
struct page *page = alloc_pages_node(node, THREADINFO_GFP,
@@ -288,7 +290,10 @@ static struct kmem_cache *thread_stack_c
static unsigned long *alloc_thread_stack_node(struct task_struct *tsk,
int node)
{
- return kmem_cache_alloc_node(thread_stack_cache, THREADINFO_GFP, node);
+ unsigned long *stack;
+ stack = kmem_cache_alloc_node(thread_stack_cache, THREADINFO_GFP, node);
+ tsk->stack = stack;
+ return stack;
}
static void free_thread_stack(struct task_struct *tsk)
_
Patches currently in -mm which might be from riel(a)surriel.com are
forkmemcg-fix-crash-in-free_thread_stack-on-memcg-charge-fail.patch
The patch titled
Subject: proc/sysctl: don't return ENOMEM on lookup when a table is unregistering
has been removed from the -mm tree. Its filename was
proc-sysctl-dont-return-enomem-on-lookup-when-a-table-is-unregistering.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Ivan Delalande <colona(a)arista.com>
Subject: proc/sysctl: don't return ENOMEM on lookup when a table is unregistering
proc_sys_lookup can fail with ENOMEM instead of ENOENT when the
corresponding sysctl table is being unregistered. In our case we see this
upon opening /proc/sys/net/*/conf files while network interfaces are being
deleted, which confuses our configuration daemon.
The problem was successfully reproduced and this fix tested on v4.9.122
and v4.20-rc6.
Link: http://lkml.kernel.org/r/20181213232052.GA1513@visor
Fixes: ace0c791e6c3 ("proc/sysctl: Don't grab i_lock under sysctl_lock.")
Signed-off-by: Ivan Delalande <colona(a)arista.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: "Eric W. Biederman" <ebiederm(a)xmission.com>
Cc: Alexey Dobriyan <adobriyan(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/proc_sysctl.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
--- a/fs/proc/proc_sysctl.c~proc-sysctl-dont-return-enomem-on-lookup-when-a-table-is-unregistering
+++ a/fs/proc/proc_sysctl.c
@@ -464,7 +464,7 @@ static struct inode *proc_sys_make_inode
inode = new_inode(sb);
if (!inode)
- goto out;
+ return ERR_PTR(-ENOMEM);
inode->i_ino = get_next_ino();
@@ -474,7 +474,7 @@ static struct inode *proc_sys_make_inode
if (unlikely(head->unregistering)) {
spin_unlock(&sysctl_lock);
iput(inode);
- inode = NULL;
+ inode = ERR_PTR(-ENOENT);
goto out;
}
ei->sysctl = head;
@@ -549,10 +549,11 @@ static struct dentry *proc_sys_lookup(st
goto out;
}
- err = ERR_PTR(-ENOMEM);
inode = proc_sys_make_inode(dir->i_sb, h ? h : head, p);
- if (!inode)
+ if (IS_ERR(inode)) {
+ err = ERR_CAST(inode);
goto out;
+ }
d_set_d_op(dentry, &proc_sys_dentry_operations);
err = d_splice_alias(inode, dentry);
@@ -685,7 +686,7 @@ static bool proc_sys_fill_cache(struct f
if (d_in_lookup(child)) {
struct dentry *res;
inode = proc_sys_make_inode(dir->d_sb, head, table);
- if (!inode) {
+ if (IS_ERR(inode)) {
d_lookup_done(child);
dput(child);
return false;
_
Patches currently in -mm which might be from colona(a)arista.com are
The patch titled
Subject: scripts/spdxcheck.py: always open files in binary mode
has been removed from the -mm tree. Its filename was
scripts-spdxcheckpy-always-open-files-in-binary-mode.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Thierry Reding <treding(a)nvidia.com>
Subject: scripts/spdxcheck.py: always open files in binary mode
The spdxcheck script currently falls over when confronted with a binary
file (such as Documentation/logo.gif). To avoid that, always open files
in binary mode and decode line-by-line, ignoring encoding errors.
One tricky case is when piping data into the script and reading it from
standard input. By default, standard input will be opened in text mode,
so we need to reopen it in binary mode.
The breakage only happens with python3 and results in a
UnicodeDecodeError (according to Uwe).
Link: http://lkml.kernel.org/r/20181212131210.28024-1-thierry.reding@gmail.com
Fixes: 6f4d29df66ac ("scripts/spdxcheck.py: make python3 compliant")
Signed-off-by: Thierry Reding <treding(a)nvidia.com>
Reviewed-by: Jeremy Cline <jcline(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Joe Perches <joe(a)perches.com>
Cc: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
scripts/spdxcheck.py | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/scripts/spdxcheck.py~scripts-spdxcheckpy-always-open-files-in-binary-mode
+++ a/scripts/spdxcheck.py
@@ -168,6 +168,7 @@ class id_parser(object):
self.curline = 0
try:
for line in fd:
+ line = line.decode(locale.getpreferredencoding(False), errors='ignore')
self.curline += 1
if self.curline > maxlines:
break
@@ -249,12 +250,13 @@ if __name__ == '__main__':
try:
if len(args.path) and args.path[0] == '-':
- parser.parse_lines(sys.stdin, args.maxlines, '-')
+ stdin = os.fdopen(sys.stdin.fileno(), 'rb')
+ parser.parse_lines(stdin, args.maxlines, '-')
else:
if args.path:
for p in args.path:
if os.path.isfile(p):
- parser.parse_lines(open(p), args.maxlines, p)
+ parser.parse_lines(open(p, 'rb'), args.maxlines, p)
elif os.path.isdir(p):
scan_git_subtree(repo.head.reference.commit.tree, p)
else:
_
Patches currently in -mm which might be from treding(a)nvidia.com are
scripts-add-spdxcheckpy-self-test.patch
The patch titled
Subject: userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered
has been removed from the -mm tree. Its filename was
userfaultfd-check-vm_maywrite-was-set-after-verifying-the-uffd-is-registered.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Andrea Arcangeli <aarcange(a)redhat.com>
Subject: userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered
Calling UFFDIO_UNREGISTER on virtual ranges not yet registered in uffd
could trigger an harmless false positive WARN_ON. Check the vma is
already registered before checking VM_MAYWRITE to shut off the false
positive warning.
Link: http://lkml.kernel.org/r/20181206212028.18726-2-aarcange@redhat.com
Cc: <stable(a)vger.kernel.org>
Fixes: 29ec90660d68 ("userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas")
Signed-off-by: Andrea Arcangeli <aarcange(a)redhat.com>
Reported-by: syzbot+06c7092e7d71218a2c16(a)syzkaller.appspotmail.com
Acked-by: Mike Rapoport <rppt(a)linux.ibm.com>
Acked-by: Hugh Dickins <hughd(a)google.com>
Acked-by: Peter Xu <peterx(a)redhat.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/userfaultfd.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/userfaultfd.c~userfaultfd-check-vm_maywrite-was-set-after-verifying-the-uffd-is-registered
+++ a/fs/userfaultfd.c
@@ -1566,7 +1566,6 @@ static int userfaultfd_unregister(struct
cond_resched();
BUG_ON(!vma_can_userfault(vma));
- WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
/*
* Nothing to do: this vma is already registered into this
@@ -1575,6 +1574,8 @@ static int userfaultfd_unregister(struct
if (!vma->vm_userfaultfd_ctx.ctx)
goto skip;
+ WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
+
if (vma->vm_start > start)
start = vma->vm_start;
vma_end = min(end, vma->vm_end);
_
Patches currently in -mm which might be from aarcange(a)redhat.com are
The patch titled
Subject: fs/iomap.c: get/put the page in iomap_page_create/release()
has been removed from the -mm tree. Its filename was
iomap-get-put-the-page-in-iomap_page_create-release.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Piotr Jaroszynski <pjaroszynski(a)nvidia.com>
Subject: fs/iomap.c: get/put the page in iomap_page_create/release()
migrate_page_move_mapping() expects pages with private data set to have a
page_count elevated by 1. This is what used to happen for xfs through the
buffer_heads code before the switch to iomap in commit 82cb14175e7d ("xfs:
add support for sub-pagesize writeback without buffer_heads"). Not having
the count elevated causes move_pages() to fail on memory mapped files
coming from xfs.
Make iomap compatible with the migrate_page_move_mapping() assumption by
elevating the page count as part of iomap_page_create() and lowering it in
iomap_page_release().
It causes the move_pages() syscall to misbehave on memory mapped files
from xfs. It does not not move any pages, which I suppose is "just" a
perf issue, but it also ends up returning a positive number which is
out of spec for the syscall. Talking to Michal Hocko, it sounds like
returning positive numbers might be a necessary update to move_pages()
anyway though
(https://lkml.kernel.org/r/20181116114955.GJ14706@dhcp22.suse.cz).
I only hit this in tests that verify that move_pages() actually moved
the pages. The test also got confused by the positive return from
move_pages() (it got treated as a success as positive numbers were not
expected and not handled) making it a bit harder to track down what's
going on.
Link: http://lkml.kernel.org/r/20181115184140.1388751-1-pjaroszynski@nvidia.com
Fixes: 82cb14175e7d ("xfs: add support for sub-pagesize writeback without buffer_heads")
Signed-off-by: Piotr Jaroszynski <pjaroszynski(a)nvidia.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Cc: William Kucharski <william.kucharski(a)oracle.com>
Cc: Darrick J. Wong <darrick.wong(a)oracle.com>
Cc: Brian Foster <bfoster(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/iomap.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/fs/iomap.c~iomap-get-put-the-page-in-iomap_page_create-release
+++ a/fs/iomap.c
@@ -116,6 +116,12 @@ iomap_page_create(struct inode *inode, s
atomic_set(&iop->read_count, 0);
atomic_set(&iop->write_count, 0);
bitmap_zero(iop->uptodate, PAGE_SIZE / SECTOR_SIZE);
+
+ /*
+ * migrate_page_move_mapping() assumes that pages with private data have
+ * their count elevated by 1.
+ */
+ get_page(page);
set_page_private(page, (unsigned long)iop);
SetPagePrivate(page);
return iop;
@@ -132,6 +138,7 @@ iomap_page_release(struct page *page)
WARN_ON_ONCE(atomic_read(&iop->write_count));
ClearPagePrivate(page);
set_page_private(page, 0);
+ put_page(page);
kfree(iop);
}
_
Patches currently in -mm which might be from pjaroszynski(a)nvidia.com are
This is a note to let you know that I've just added the patch titled
staging: bcm2835-audio: double free in init error path
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 649496b603000135683ee76d7ea499456617bf17 Mon Sep 17 00:00:00 2001
From: Dan Carpenter <dan.carpenter(a)oracle.com>
Date: Mon, 17 Dec 2018 10:08:54 +0300
Subject: staging: bcm2835-audio: double free in init error path
We free instance here and in the caller. It should be only the caller
which handles it.
Fixes: d7ca3a71545b ("staging: bcm2835-audio: Operate non-atomic PCM ops")
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Reviewed-by: Takashi Iwai <tiwai(a)suse.de>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
index 0db412fd7c55..c0debdbce26c 100644
--- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
+++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
@@ -138,7 +138,6 @@ vc_vchi_audio_init(VCHI_INSTANCE_T vchi_instance,
dev_err(instance->dev,
"failed to open VCHI service connection (status=%d)\n",
status);
- kfree(instance);
return -EPERM;
}
--
2.20.1
This is a note to let you know that I've just added the patch titled
usb: roles: Add a description for the class to Kconfig
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From c3788cd9963eb2e77de3c24142fb7c67b61f1a26 Mon Sep 17 00:00:00 2001
From: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Date: Wed, 12 Dec 2018 20:13:55 +0300
Subject: usb: roles: Add a description for the class to Kconfig
That makes the USB role switch support option visible and
selectable for the user. The class driver is also moved to
drivers/usb/roles/ directory.
This will fix an issue that we have with the Intel USB role
switch driver on systems that don't have USB Type-C connectors:
Intel USB role switch driver depends on the USB role switch
class as it should, but since there was no way for the user
to enable the USB role switch class, there was also no way
to select that driver. USB Type-C drivers select the USB
role switch class which makes the Intel USB role switch
driver available and therefore hides the problem.
So in practice Intel USB role switch driver was depending on
USB Type-C drivers.
Fixes: f6fb9ec02be1 ("usb: roles: Add Intel xHCI USB role switch driver")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/Kconfig | 4 ----
drivers/usb/common/Makefile | 1 -
drivers/usb/roles/Kconfig | 13 +++++++++++++
drivers/usb/roles/Makefile | 4 +++-
drivers/usb/{common/roles.c => roles/class.c} | 0
5 files changed, 16 insertions(+), 6 deletions(-)
rename drivers/usb/{common/roles.c => roles/class.c} (100%)
diff --git a/drivers/usb/Kconfig b/drivers/usb/Kconfig
index 987fc5ba6321..70e6c956c23c 100644
--- a/drivers/usb/Kconfig
+++ b/drivers/usb/Kconfig
@@ -205,8 +205,4 @@ config USB_ULPI_BUS
To compile this driver as a module, choose M here: the module will
be called ulpi.
-config USB_ROLE_SWITCH
- tristate
- select USB_COMMON
-
endif # USB_SUPPORT
diff --git a/drivers/usb/common/Makefile b/drivers/usb/common/Makefile
index fb4d5ef4165c..0a7c45e85481 100644
--- a/drivers/usb/common/Makefile
+++ b/drivers/usb/common/Makefile
@@ -9,4 +9,3 @@ usb-common-$(CONFIG_USB_LED_TRIG) += led.o
obj-$(CONFIG_USB_OTG_FSM) += usb-otg-fsm.o
obj-$(CONFIG_USB_ULPI_BUS) += ulpi.o
-obj-$(CONFIG_USB_ROLE_SWITCH) += roles.o
diff --git a/drivers/usb/roles/Kconfig b/drivers/usb/roles/Kconfig
index f5a5e6f79f1b..e4194ac94510 100644
--- a/drivers/usb/roles/Kconfig
+++ b/drivers/usb/roles/Kconfig
@@ -1,3 +1,16 @@
+config USB_ROLE_SWITCH
+ tristate "USB Role Switch Support"
+ help
+ USB Role Switch is a device that can select the USB role - host or
+ device - for a USB port (connector). In most cases dual-role capable
+ USB controller will also represent the switch, but on some platforms
+ multiplexer/demultiplexer switch is used to route the data lines on
+ the USB connector between separate USB host and device controllers.
+
+ Say Y here if your USB connectors support both device and host roles.
+ To compile the driver as module, choose M here: the module will be
+ called roles.ko.
+
if USB_ROLE_SWITCH
config USB_ROLES_INTEL_XHCI
diff --git a/drivers/usb/roles/Makefile b/drivers/usb/roles/Makefile
index e44b179ba275..c02873206fc1 100644
--- a/drivers/usb/roles/Makefile
+++ b/drivers/usb/roles/Makefile
@@ -1 +1,3 @@
-obj-$(CONFIG_USB_ROLES_INTEL_XHCI) += intel-xhci-usb-role-switch.o
+obj-$(CONFIG_USB_ROLE_SWITCH) += roles.o
+roles-y := class.o
+obj-$(CONFIG_USB_ROLES_INTEL_XHCI) += intel-xhci-usb-role-switch.o
diff --git a/drivers/usb/common/roles.c b/drivers/usb/roles/class.c
similarity index 100%
rename from drivers/usb/common/roles.c
rename to drivers/usb/roles/class.c
--
2.20.1
On 12/14/18 7:55 AM, Ross Lagerwall wrote:
> If pcistub_init_device fails, the release function will be called with
> dev_data set to NULL. Check it before using it to avoid a NULL pointer
> dereference.
>
> Signed-off-by: Ross Lagerwall <ross.lagerwall(a)citrix.com>
I think this should go to stable trees too (copying them)
Applied to for-linus-4.21
-boris
> ---
> drivers/xen/xen-pciback/pci_stub.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
> index 59661db144e5..097410a7cdb7 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -106,7 +106,8 @@ static void pcistub_device_release(struct kref *kref)
> * is called from "unbind" which takes a device_lock mutex.
> */
> __pci_reset_function_locked(dev);
> - if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
> + if (dev_data &&
> + pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
> dev_info(&dev->dev, "Could not reload PCI state\n");
> else
> pci_restore_state(dev);
This is a note to let you know that I've just added the patch titled
Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 3c9dc275dba1124c1e16e7037226038451286813 Mon Sep 17 00:00:00 2001
From: Paul Burton <paul.burton(a)mips.com>
Date: Sun, 16 Dec 2018 20:10:01 +0000
Subject: Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"
Commit f6aa5beb45be ("serial: 8250: Fix clearing FIFOs in RS485 mode
again") makes a change to FIFO clearing code which its commit message
suggests was intended to be specific to use with RS485 mode, however:
1) The change made does not just affect __do_stop_tx_rs485(), it also
affects other uses of serial8250_clear_fifos() including paths for
starting up, shutting down or auto-configuring a port regardless of
whether it's an RS485 port or not.
2) It makes the assumption that resetting the FIFOs is a no-op when
FIFOs are disabled, and as such it checks for this case & explicitly
avoids setting the FIFO reset bits when the FIFO enable bit is
clear. A reading of the PC16550D manual would suggest that this is
OK since the FIFO should automatically be reset if it is later
enabled, but we support many 16550-compatible devices and have never
required this auto-reset behaviour for at least the whole git era.
Starting to rely on it now seems risky, offers no benefit, and
indeed breaks at least the Ingenic JZ4780's UARTs which reads
garbage when the RX FIFO is enabled if we don't explicitly reset it.
3) By only resetting the FIFOs if they're enabled, the behaviour of
serial8250_do_startup() during boot now depends on what the value of
FCR is before the 8250 driver is probed. This in itself seems
questionable and leaves us with FCR=0 & no FIFO reset if the UART
was used by 8250_early, otherwise it depends upon what the
bootloader left behind.
4) Although the naming of serial8250_clear_fifos() may be unclear, it
is clear that callers of it expect that it will disable FIFOs. Both
serial8250_do_startup() & serial8250_do_shutdown() contain comments
to that effect, and other callers explicitly re-enable the FIFOs
after calling serial8250_clear_fifos(). The premise of that patch
that disabling the FIFOs is incorrect therefore seems wrong.
For these reasons, this reverts commit f6aa5beb45be ("serial: 8250: Fix
clearing FIFOs in RS485 mode again").
Signed-off-by: Paul Burton <paul.burton(a)mips.com>
Fixes: f6aa5beb45be ("serial: 8250: Fix clearing FIFOs in RS485 mode again").
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Daniel Jedrychowski <avistel(a)gmail.com>
Cc: Marek Vasut <marex(a)denx.de>
Cc: linux-mips(a)vger.kernel.org
Cc: linux-serial(a)vger.kernel.org
Cc: stable <stable(a)vger.kernel.org> # 4.10+
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/8250/8250_port.c | 29 +++++------------------------
1 file changed, 5 insertions(+), 24 deletions(-)
diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index f776b3eafb96..3f779d25ec0c 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -552,30 +552,11 @@ static unsigned int serial_icr_read(struct uart_8250_port *up, int offset)
*/
static void serial8250_clear_fifos(struct uart_8250_port *p)
{
- unsigned char fcr;
- unsigned char clr_mask = UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT;
-
if (p->capabilities & UART_CAP_FIFO) {
- /*
- * Make sure to avoid changing FCR[7:3] and ENABLE_FIFO bits.
- * In case ENABLE_FIFO is not set, there is nothing to flush
- * so just return. Furthermore, on certain implementations of
- * the 8250 core, the FCR[7:3] bits may only be changed under
- * specific conditions and changing them if those conditions
- * are not met can have nasty side effects. One such core is
- * the 8250-omap present in TI AM335x.
- */
- fcr = serial_in(p, UART_FCR);
-
- /* FIFO is not enabled, there's nothing to clear. */
- if (!(fcr & UART_FCR_ENABLE_FIFO))
- return;
-
- fcr |= clr_mask;
- serial_out(p, UART_FCR, fcr);
-
- fcr &= ~clr_mask;
- serial_out(p, UART_FCR, fcr);
+ serial_out(p, UART_FCR, UART_FCR_ENABLE_FIFO);
+ serial_out(p, UART_FCR, UART_FCR_ENABLE_FIFO |
+ UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT);
+ serial_out(p, UART_FCR, 0);
}
}
@@ -1467,7 +1448,7 @@ static void __do_stop_tx_rs485(struct uart_8250_port *p)
* Enable previously disabled RX interrupts.
*/
if (!(p->port.rs485.flags & SER_RS485_RX_DURING_TX)) {
- serial8250_clear_fifos(p);
+ serial8250_clear_and_reinit_fifos(p);
p->ier |= UART_IER_RLSI | UART_IER_RDI;
serial_port_out(&p->port, UART_IER, p->ier);
--
2.20.1
This is a note to let you know that I've just added the patch titled
USB: xhci: fix 'broken_suspend' placement in struct xchi_hcd
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 2419f30a4a4fcaa5f35111563b4c61f1b2b26841 Mon Sep 17 00:00:00 2001
From: Nicolas Saenz Julienne <nsaenzjulienne(a)suse.de>
Date: Mon, 17 Dec 2018 14:37:40 +0100
Subject: USB: xhci: fix 'broken_suspend' placement in struct xchi_hcd
As commented in the struct's definition there shouldn't be anything
underneath its 'priv[0]' member as it would break some macros.
The patch converts the broken_suspend into a bit-field and relocates it
next to to the rest of bit-fields.
Fixes: a7d57abcc8a5 ("xhci: workaround CSS timeout on AMD SNPS 3.0 xHC")
Reported-by: Oliver Neukum <oneukum(a)suse.com>
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne(a)suse.de>
Acked-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/host/xhci.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index c3515bad5dbb..011dd45f8718 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1863,6 +1863,8 @@ struct xhci_hcd {
unsigned sw_lpm_support:1;
/* support xHCI 1.0 spec USB2 hardware LPM */
unsigned hw_lpm_support:1;
+ /* Broken Suspend flag for SNPS Suspend resume issue */
+ unsigned broken_suspend:1;
/* cached usb2 extened protocol capabilites */
u32 *ext_caps;
unsigned int num_ext_caps;
@@ -1880,8 +1882,6 @@ struct xhci_hcd {
void *dbc;
/* platform-specific data -- must come last */
unsigned long priv[0] __aligned(sizeof(s64));
- /* Broken Suspend flag for SNPS Suspend resume issue */
- u8 broken_suspend;
};
/* Platform specific overrides to generic XHCI hc_driver ops */
--
2.20.1
The gpio IP on Armada 370 at offset 0x18180 has neither a clk nor pwm
registers. So there is no need for a clk as the pwm isn't used anyhow.
So only check for the clk in the presence of the pwm registers. This fixes
a failure to probe the gpio driver for the above mentioned gpio device.
Fixes: 757642f9a584 ("gpio: mvebu: Add limited PWM support")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
---
drivers/gpio/gpio-mvebu.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpio/gpio-mvebu.c b/drivers/gpio/gpio-mvebu.c
index 6e02148c208b..adc768f908f1 100644
--- a/drivers/gpio/gpio-mvebu.c
+++ b/drivers/gpio/gpio-mvebu.c
@@ -773,9 +773,6 @@ static int mvebu_pwm_probe(struct platform_device *pdev,
"marvell,armada-370-gpio"))
return 0;
- if (IS_ERR(mvchip->clk))
- return PTR_ERR(mvchip->clk);
-
/*
* There are only two sets of PWM configuration registers for
* all the GPIO lines on those SoCs which this driver reserves
@@ -786,6 +783,9 @@ static int mvebu_pwm_probe(struct platform_device *pdev,
if (!res)
return 0;
+ if (IS_ERR(mvchip->clk))
+ return PTR_ERR(mvchip->clk);
+
/*
* Use set A for lines of GPIO chip with id 0, B for GPIO chip
* with id 1. Don't allow further GPIO chips to be used for PWM.
--
2.19.2
When jffs2 has to retry reading xattrs we need to reset
the buffer pointer. Otherwise we return old xattrs from the
previous iteration which leads to a inconsistency between
the number of bytes we return and the real list size.
Cc: <stable(a)vger.kernel.org>
Cc: Andreas Gruenbacher <agruenba(a)redhat.com>
Fixes: 764a5c6b1fa4 ("xattr handlers: Simplify list operation")
Signed-off-by: Richard Weinberger <richard(a)nod.at>
---
Andreas,
since you maintain the attr package too, I report it right here. :-)
This jffs2 bug lead to a crash in attr_list().
for() will loop to crash when there is no trailing \0 in the
list of xattrs.
for (l = lbuf; l != lbuf + length; l = strchr(l, '\0') + 1) {
if (api_unconvert(name, l, flags))
continue;
...
}
I suggest changing the loop condition to something like l < lbuf + length.
Thanks,
//richard
---
fs/jffs2/xattr.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/jffs2/xattr.c b/fs/jffs2/xattr.c
index da3e18503c65..0cb322eb9516 100644
--- a/fs/jffs2/xattr.c
+++ b/fs/jffs2/xattr.c
@@ -967,6 +967,7 @@ ssize_t jffs2_listxattr(struct dentry *dentry, char *buffer, size_t size)
struct jffs2_xattr_ref *ref, **pref;
struct jffs2_xattr_datum *xd;
const struct xattr_handler *xhandle;
+ char *orig_buffer = buffer;
const char *prefix;
ssize_t prefix_len, len, rc;
int retry = 0;
@@ -977,6 +978,7 @@ ssize_t jffs2_listxattr(struct dentry *dentry, char *buffer, size_t size)
down_read(&c->xattr_sem);
retry:
+ buffer = orig_buffer;
len = 0;
for (ref=ic->xref, pref=&ic->xref; ref; pref=&ref->next, ref=ref->next) {
BUG_ON(ref->ic != ic);
--
2.20.0
This is a note to let you know that I've just added the patch titled
staging: bcm2835-audio: double free in init error path
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the staging-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 649496b603000135683ee76d7ea499456617bf17 Mon Sep 17 00:00:00 2001
From: Dan Carpenter <dan.carpenter(a)oracle.com>
Date: Mon, 17 Dec 2018 10:08:54 +0300
Subject: staging: bcm2835-audio: double free in init error path
We free instance here and in the caller. It should be only the caller
which handles it.
Fixes: d7ca3a71545b ("staging: bcm2835-audio: Operate non-atomic PCM ops")
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Reviewed-by: Takashi Iwai <tiwai(a)suse.de>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
index 0db412fd7c55..c0debdbce26c 100644
--- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
+++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c
@@ -138,7 +138,6 @@ vc_vchi_audio_init(VCHI_INSTANCE_T vchi_instance,
dev_err(instance->dev,
"failed to open VCHI service connection (status=%d)\n",
status);
- kfree(instance);
return -EPERM;
}
--
2.20.1