The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ea7e0480a4b695d0aa6b3fa99bd658a003122113 Mon Sep 17 00:00:00 2001
From: Paul Burton <paul.burton(a)mips.com>
Date: Tue, 25 Sep 2018 15:51:26 -0700
Subject: [PATCH] MIPS: VDSO: Always map near top of user memory
When using the legacy mmap layout, for example triggered using ulimit -s
unlimited, get_unmapped_area() fills memory from bottom to top starting
from a fairly low address near TASK_UNMAPPED_BASE.
This placement is suboptimal if the user application wishes to allocate
large amounts of heap memory using the brk syscall. With the VDSO being
located low in the user's virtual address space, the amount of space
available for access using brk is limited much more than it was prior to
the introduction of the VDSO.
For example:
# ulimit -s unlimited; cat /proc/self/maps
00400000-004ec000 r-xp 00000000 08:00 71436 /usr/bin/coreutils
004fc000-004fd000 rwxp 000ec000 08:00 71436 /usr/bin/coreutils
004fd000-0050f000 rwxp 00000000 00:00 0
00cc3000-00ce4000 rwxp 00000000 00:00 0 [heap]
2ab96000-2ab98000 r--p 00000000 00:00 0 [vvar]
2ab98000-2ab99000 r-xp 00000000 00:00 0 [vdso]
2ab99000-2ab9d000 rwxp 00000000 00:00 0
...
Resolve this by adjusting STACK_TOP to reserve space for the VDSO &
providing an address hint to get_unmapped_area() causing it to use this
space even when using the legacy mmap layout.
We reserve enough space for the VDSO, plus 1MB or 256MB for 32 bit & 64
bit systems respectively within which we randomize the VDSO base
address. Previously this randomization was taken care of by the mmap
base address randomization performed by arch_mmap_rnd(). The 1MB & 256MB
sizes are somewhat arbitrary but chosen such that we have some
randomization without taking up too much of the user's virtual address
space, which is often in short supply for 32 bit systems.
With this the VDSO is always mapped at a high address, leaving lots of
space for statically linked programs to make use of brk:
# ulimit -s unlimited; cat /proc/self/maps
00400000-004ec000 r-xp 00000000 08:00 71436 /usr/bin/coreutils
004fc000-004fd000 rwxp 000ec000 08:00 71436 /usr/bin/coreutils
004fd000-0050f000 rwxp 00000000 00:00 0
00c28000-00c49000 rwxp 00000000 00:00 0 [heap]
...
7f67c000-7f69d000 rwxp 00000000 00:00 0 [stack]
7f7fc000-7f7fd000 rwxp 00000000 00:00 0
7fcf1000-7fcf3000 r--p 00000000 00:00 0 [vvar]
7fcf3000-7fcf4000 r-xp 00000000 00:00 0 [vdso]
Signed-off-by: Paul Burton <paul.burton(a)mips.com>
Reported-by: Huacai Chen <chenhc(a)lemote.com>
Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO")
Cc: Huacai Chen <chenhc(a)lemote.com>
Cc: linux-mips(a)linux-mips.org
Cc: stable(a)vger.kernel.org # v4.4+
diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
index b2fa62922d88..49d6046ca1d0 100644
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -13,6 +13,7 @@
#include <linux/atomic.h>
#include <linux/cpumask.h>
+#include <linux/sizes.h>
#include <linux/threads.h>
#include <asm/cachectl.h>
@@ -80,11 +81,10 @@ extern unsigned int vced_count, vcei_count;
#endif
-/*
- * One page above the stack is used for branch delay slot "emulation".
- * See dsemul.c for details.
- */
-#define STACK_TOP ((TASK_SIZE & PAGE_MASK) - PAGE_SIZE)
+#define VDSO_RANDOMIZE_SIZE (TASK_IS_32BIT_ADDR ? SZ_1M : SZ_256M)
+
+extern unsigned long mips_stack_top(void);
+#define STACK_TOP mips_stack_top()
/*
* This decides where the kernel will search for a free chunk of vm
diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index 8fc69891e117..d4f7fd4550e1 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -32,6 +32,7 @@
#include <linux/nmi.h>
#include <linux/cpu.h>
+#include <asm/abi.h>
#include <asm/asm.h>
#include <asm/bootinfo.h>
#include <asm/cpu.h>
@@ -39,6 +40,7 @@
#include <asm/dsp.h>
#include <asm/fpu.h>
#include <asm/irq.h>
+#include <asm/mips-cps.h>
#include <asm/msa.h>
#include <asm/pgtable.h>
#include <asm/mipsregs.h>
@@ -645,6 +647,29 @@ unsigned long get_wchan(struct task_struct *task)
return pc;
}
+unsigned long mips_stack_top(void)
+{
+ unsigned long top = TASK_SIZE & PAGE_MASK;
+
+ /* One page for branch delay slot "emulation" */
+ top -= PAGE_SIZE;
+
+ /* Space for the VDSO, data page & GIC user page */
+ top -= PAGE_ALIGN(current->thread.abi->vdso->size);
+ top -= PAGE_SIZE;
+ top -= mips_gic_present() ? PAGE_SIZE : 0;
+
+ /* Space for cache colour alignment */
+ if (cpu_has_dc_aliases)
+ top -= shm_align_mask + 1;
+
+ /* Space to randomize the VDSO base */
+ if (current->flags & PF_RANDOMIZE)
+ top -= VDSO_RANDOMIZE_SIZE;
+
+ return top;
+}
+
/*
* Don't forget that the stack pointer must be aligned on a 8 bytes
* boundary for 32-bits ABI and 16 bytes for 64-bits ABI.
diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index 8f845f6e5f42..48a9c6b90e07 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -15,6 +15,7 @@
#include <linux/ioport.h>
#include <linux/kernel.h>
#include <linux/mm.h>
+#include <linux/random.h>
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/timekeeper_internal.h>
@@ -97,6 +98,21 @@ void update_vsyscall_tz(void)
}
}
+static unsigned long vdso_base(void)
+{
+ unsigned long base;
+
+ /* Skip the delay slot emulation page */
+ base = STACK_TOP + PAGE_SIZE;
+
+ if (current->flags & PF_RANDOMIZE) {
+ base += get_random_int() & (VDSO_RANDOMIZE_SIZE - 1);
+ base = PAGE_ALIGN(base);
+ }
+
+ return base;
+}
+
int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
{
struct mips_vdso_image *image = current->thread.abi->vdso;
@@ -137,7 +153,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
if (cpu_has_dc_aliases)
size += shm_align_mask + 1;
- base = get_unmapped_area(NULL, 0, size, 0, 0);
+ base = get_unmapped_area(NULL, vdso_base(), size, 0, 0);
if (IS_ERR_VALUE(base)) {
ret = base;
goto out;
I guess I should put more stable backport tags on stuff, I've been
bugged about not having requested a backport for the following:
commit 28e2c4bb99aa ("mm/vmstat.c: fix outdated vmstat_text"), for
stable kernels 4.14 and 4.18.
Unfortunately, it appears our fix in:
commit b5d29843d8ef ("drm/atomic_helper: Allow DPMS On<->Off changes
for unregistered connectors")
Which attempted to work around the problems introduced by:
commit 4d80273976bf ("drm/atomic_helper: Disallow new modesets on
unregistered connectors")
Is still not the right solution, as modesets can still be triggered
outside of drm_atomic_set_crtc_for_connector().
So in order to fix this, while still being careful that we don't break
modesets that a driver may perform before being registered with
userspace, we replace connector->registered with a tristate member,
connector->registration_state. This allows us to keep track of whether
or not a connector is still initializing and hasn't been exposed to
userspace, is currently registered and exposed to userspace, or has been
legitimately removed from the system after having once been present.
Using this info, we can prevent userspace from performing new modesets
on unregistered connectors while still allowing the driver to perform
modesets on unregistered connectors before the driver has finished being
registered.
Fixes: b5d29843d8ef ("drm/atomic_helper: Allow DPMS On<->Off changes for unregistered connectors")
Cc: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Cc: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Cc: stable(a)vger.kernel.org
Cc: David Airlie <airlied(a)linux.ie>
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
---
drivers/gpu/drm/drm_atomic_helper.c | 60 +++++++++++++++++++++----
drivers/gpu/drm/drm_atomic_uapi.c | 21 ---------
drivers/gpu/drm/drm_connector.c | 10 ++---
drivers/gpu/drm/i915/intel_dp_mst.c | 8 ++--
include/drm/drm_connector.h | 68 ++++++++++++++++++++++++++++-
5 files changed, 127 insertions(+), 40 deletions(-)
diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 6f66777dca4b..6cadeaf28ae4 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -529,6 +529,35 @@ mode_valid(struct drm_atomic_state *state)
return 0;
}
+static int
+unregistered_connector_check(struct drm_atomic_state *state,
+ struct drm_connector *connector,
+ struct drm_connector_state *old_conn_state,
+ struct drm_connector_state *new_conn_state)
+{
+ struct drm_crtc_state *crtc_state;
+ struct drm_crtc *crtc;
+
+ if (!drm_connector_unregistered(connector))
+ return 0;
+
+ crtc = new_conn_state->crtc;
+ if (!crtc)
+ return 0;
+
+ crtc_state = drm_atomic_get_new_crtc_state(state, crtc);
+ if (!crtc_state || !drm_atomic_crtc_needs_modeset(crtc_state))
+ return 0;
+
+ if (crtc_state->mode_changed) {
+ DRM_DEBUG_ATOMIC("[CONNECTOR:%d:%s] can't change mode on unregistered connector\n",
+ connector->base.id, connector->name);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
/**
* drm_atomic_helper_check_modeset - validate state object for modeset changes
* @dev: DRM device
@@ -684,18 +713,33 @@ drm_atomic_helper_check_modeset(struct drm_device *dev,
return ret;
}
- /*
- * Iterate over all connectors again, to make sure atomic_check()
- * has been called on them when a modeset is forced.
- */
for_each_oldnew_connector_in_state(state, connector, old_connector_state, new_connector_state, i) {
const struct drm_connector_helper_funcs *funcs = connector->helper_private;
- if (connectors_mask & BIT(i))
- continue;
+ /* Make sure atomic_check() is called on any unchecked
+ * connectors when a modeset has been forced
+ */
+ if (connectors_mask & BIT(i) && funcs->atomic_check) {
+ ret = funcs->atomic_check(connector,
+ new_connector_state);
+ if (ret)
+ return ret;
+ }
- if (funcs->atomic_check)
- ret = funcs->atomic_check(connector, new_connector_state);
+ /*
+ * Prevent userspace from turning on new displays or setting
+ * new modes using connectors which have been removed from
+ * userspace. This is racy since an unplug could happen at any
+ * time including after this check, but that's OK: we only
+ * care about preventing userspace from trying to set invalid
+ * state using destroyed connectors that it's been notified
+ * about. No one can save us after the atomic check completes
+ * but ourselves.
+ */
+ ret = unregistered_connector_check(state,
+ connector,
+ old_connector_state,
+ new_connector_state);
if (ret)
return ret;
}
diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c
index a22d6f269b07..d5b7f315098c 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -299,27 +299,6 @@ drm_atomic_set_crtc_for_connector(struct drm_connector_state *conn_state,
struct drm_connector *connector = conn_state->connector;
struct drm_crtc_state *crtc_state;
- /*
- * For compatibility with legacy users, we want to make sure that
- * we allow DPMS On<->Off modesets on unregistered connectors, since
- * legacy modesetting users will not be expecting these to fail. We do
- * not however, want to allow legacy users to assign a connector
- * that's been unregistered from sysfs to another CRTC, since doing
- * this with a now non-existent connector could potentially leave us
- * in an invalid state.
- *
- * Since the connector can be unregistered at any point during an
- * atomic check or commit, this is racy. But that's OK: all we care
- * about is ensuring that userspace can't use this connector for new
- * configurations after it's been notified that the connector is no
- * longer present.
- */
- if (!READ_ONCE(connector->registered) && crtc) {
- DRM_DEBUG_ATOMIC("[CONNECTOR:%d:%s] is not registered\n",
- connector->base.id, connector->name);
- return -EINVAL;
- }
-
if (conn_state->crtc == crtc)
return 0;
diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 5d01414ec9f7..79943102fe18 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -396,7 +396,7 @@ void drm_connector_cleanup(struct drm_connector *connector)
/* The connector should have been removed from userspace long before
* it is finally destroyed.
*/
- if (WARN_ON(connector->registered))
+ if (WARN_ON(!drm_connector_unregistered(connector)))
drm_connector_unregister(connector);
if (connector->tile_group) {
@@ -453,7 +453,7 @@ int drm_connector_register(struct drm_connector *connector)
return 0;
mutex_lock(&connector->mutex);
- if (connector->registered)
+ if (connector->registration_state != DRM_CONNECTOR_INITIALIZING)
goto unlock;
ret = drm_sysfs_connector_add(connector);
@@ -473,7 +473,7 @@ int drm_connector_register(struct drm_connector *connector)
drm_mode_object_register(connector->dev, &connector->base);
- connector->registered = true;
+ connector->registration_state = DRM_CONNECTOR_REGISTERED;
goto unlock;
err_debugfs:
@@ -495,7 +495,7 @@ EXPORT_SYMBOL(drm_connector_register);
void drm_connector_unregister(struct drm_connector *connector)
{
mutex_lock(&connector->mutex);
- if (!connector->registered) {
+ if (connector->registration_state != DRM_CONNECTOR_REGISTERED) {
mutex_unlock(&connector->mutex);
return;
}
@@ -506,7 +506,7 @@ void drm_connector_unregister(struct drm_connector *connector)
drm_sysfs_connector_remove(connector);
drm_debugfs_connector_remove(connector);
- connector->registered = false;
+ connector->registration_state = DRM_CONNECTOR_UNREGISTERED;
mutex_unlock(&connector->mutex);
}
EXPORT_SYMBOL(drm_connector_unregister);
diff --git a/drivers/gpu/drm/i915/intel_dp_mst.c b/drivers/gpu/drm/i915/intel_dp_mst.c
index b268bdd71bd3..ad367309e10b 100644
--- a/drivers/gpu/drm/i915/intel_dp_mst.c
+++ b/drivers/gpu/drm/i915/intel_dp_mst.c
@@ -78,7 +78,7 @@ static bool intel_dp_mst_compute_config(struct intel_encoder *encoder,
pipe_config->pbn = mst_pbn;
/* Zombie connectors can't have VCPI slots */
- if (READ_ONCE(connector->registered)) {
+ if (!drm_connector_unregistered(connector)) {
slots = drm_dp_atomic_find_vcpi_slots(state,
&intel_dp->mst_mgr,
port,
@@ -314,7 +314,7 @@ static int intel_dp_mst_get_ddc_modes(struct drm_connector *connector)
struct edid *edid;
int ret;
- if (!READ_ONCE(connector->registered))
+ if (drm_connector_unregistered(connector))
return intel_connector_update_modes(connector, NULL);
edid = drm_dp_mst_get_edid(connector, &intel_dp->mst_mgr, intel_connector->port);
@@ -330,7 +330,7 @@ intel_dp_mst_detect(struct drm_connector *connector, bool force)
struct intel_connector *intel_connector = to_intel_connector(connector);
struct intel_dp *intel_dp = intel_connector->mst_port;
- if (!READ_ONCE(connector->registered))
+ if (drm_connector_unregistered(connector))
return connector_status_disconnected;
return drm_dp_mst_detect_port(connector, &intel_dp->mst_mgr,
intel_connector->port);
@@ -361,7 +361,7 @@ intel_dp_mst_mode_valid(struct drm_connector *connector,
int bpp = 24; /* MST uses fixed bpp */
int max_rate, mode_rate, max_lanes, max_link_clock;
- if (!READ_ONCE(connector->registered))
+ if (drm_connector_unregistered(connector))
return MODE_ERROR;
if (mode->flags & DRM_MODE_FLAG_DBLSCAN)
diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
index 5b3cf909fd5e..5f3e4a37bcd2 100644
--- a/include/drm/drm_connector.h
+++ b/include/drm/drm_connector.h
@@ -82,6 +82,51 @@ enum drm_connector_status {
connector_status_unknown = 3,
};
+/**
+ * enum drm_connector_registration_status - userspace registration status for
+ * a &drm_connector
+ *
+ * This enum is used to track the status of initializing a connector and
+ * registering it with userspace, so that DRM can prevent bogus modesets on
+ * connectors that no longer exist.
+ */
+enum drm_connector_registration_state {
+ /**
+ * @DRM_CONNECTOR_INITIALIZING: The connector has just been created,
+ * but has yet to be exposed to userspace. There should be no
+ * additional restrictions to how the state of this connector may be
+ * modified. connector to a new CRTC.
+ */
+ DRM_CONNECTOR_INITIALIZING = 0,
+
+ /**
+ * @DRM_CONNECTOR_REGISTERED: The connector has been fully initialized
+ * and registered with sysfs, as such it has been exposed to
+ * userspace. There should be no additional restrictions to how the
+ * state of this connector may be modified.
+ */
+ DRM_CONNECTOR_REGISTERED = 1,
+
+ /**
+ * @DRM_CONNECTOR_UNREGISTERED: The connector has either been exposed
+ * to userspace and has since been unregistered and removed from
+ * userspace, or the connector was destroyed before it had a chance to
+ * be exposed to userspace. There are additional restrictions to how
+ * the state of an unregistered connector may be modified:
+ *
+ * - The current display mode as exposed to userspace must remain the
+ * same as it was when the connector was unregistered.
+ * - The connector's currently assigned CRTC may be unassigned from
+ * this connector. Unassigning the connector's CRTC must be
+ * permanent.
+ * - New CRTCs must not be assigned to this connector.
+ * - The DPMS state of the connector (for compatibility with legacy
+ * modesetting) may modified freely, so long as doing so does not
+ * cause the new atomic state to violate any of these rules.
+ */
+ DRM_CONNECTOR_UNREGISTERED = 2,
+};
+
enum subpixel_order {
SubPixelUnknown = 0,
SubPixelHorizontalRGB,
@@ -853,10 +898,12 @@ struct drm_connector {
bool ycbcr_420_allowed;
/**
- * @registered: Is this connector exposed (registered) with userspace?
+ * @registration_state: Is this connector initializing, exposed
+ * (registered) with userspace, or unregistered?
+ *
* Protected by @mutex.
*/
- bool registered;
+ enum drm_connector_registration_state registration_state;
/**
* @modes:
@@ -1167,6 +1214,23 @@ static inline void drm_connector_unreference(struct drm_connector *connector)
drm_connector_put(connector);
}
+/**
+ * drm_connector_unregistered - has the connector been unregistered from
+ * userspace?
+ * @connector: DRM connector
+ *
+ * Checks whether or not @connector has been unregistered from userspace.
+ *
+ * Returns:
+ * True if the connector was unregistered, false if the connector is
+ * registered or has not yet been registered with userspace.
+ */
+static inline bool drm_connector_unregistered(struct drm_connector *connector)
+{
+ return READ_ONCE(connector->registration_state) ==
+ DRM_CONNECTOR_UNREGISTERED;
+}
+
const char *drm_get_connector_status_name(enum drm_connector_status status);
const char *drm_get_subpixel_order_name(enum subpixel_order order);
const char *drm_get_dpms_name(int val);
--
2.17.2
This is a note to let you know that I've just added the patch titled
USB: fix the usbfs flag sanitization for control transfers
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 665c365a77fbfeabe52694aedf3446d5f2f1ce42 Mon Sep 17 00:00:00 2001
From: Alan Stern <stern(a)rowland.harvard.edu>
Date: Mon, 15 Oct 2018 16:55:04 -0400
Subject: USB: fix the usbfs flag sanitization for control transfers
Commit 7a68d9fb8510 ("USB: usbdevfs: sanitize flags more") checks the
transfer flags for URBs submitted from userspace via usbfs. However,
the check for whether the USBDEVFS_URB_SHORT_NOT_OK flag should be
allowed for a control transfer was added in the wrong place, before
the code has properly determined the direction of the control
transfer. (Control transfers are special because for them, the
direction is set by the bRequestType byte of the Setup packet rather
than direction bit of the endpoint address.)
This patch moves code which sets up the allow_short flag for control
transfers down after is_in has been set to the correct value.
Signed-off-by: Alan Stern <stern(a)rowland.harvard.edu>
Reported-and-tested-by: syzbot+24a30223a4b609bb802e(a)syzkaller.appspotmail.com
Fixes: 7a68d9fb8510 ("USB: usbdevfs: sanitize flags more")
CC: Oliver Neukum <oneukum(a)suse.com>
CC: <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/core/devio.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
index 244417d0dfd1..ffccd40ea67d 100644
--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -1474,8 +1474,6 @@ static int proc_do_submiturb(struct usb_dev_state *ps, struct usbdevfs_urb *uurb
u = 0;
switch (uurb->type) {
case USBDEVFS_URB_TYPE_CONTROL:
- if (is_in)
- allow_short = true;
if (!usb_endpoint_xfer_control(&ep->desc))
return -EINVAL;
/* min 8 byte setup packet */
@@ -1505,6 +1503,8 @@ static int proc_do_submiturb(struct usb_dev_state *ps, struct usbdevfs_urb *uurb
is_in = 0;
uurb->endpoint &= ~USB_DIR_IN;
}
+ if (is_in)
+ allow_short = true;
snoop(&ps->dev->dev, "control urb: bRequestType=%02x "
"bRequest=%02x wValue=%04x "
"wIndex=%04x wLength=%04x\n",
--
2.19.1
The patch titled
Subject: mm: /proc/pid/smaps_rollup: fix NULL pointer deref in smaps_pte_range()
has been added to the -mm tree. Its filename is
mm-proc-pid-smaps_rollup-fix-null-pointer-deref-in-smaps_pte_range.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-proc-pid-smaps_rollup-fix-null-…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-proc-pid-smaps_rollup-fix-null-…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Vlastimil Babka <vbabka(a)suse.cz>
Subject: mm: /proc/pid/smaps_rollup: fix NULL pointer deref in smaps_pte_range()
Leonardo reports an apparent regression in 4.19-rc7:
BUG: unable to handle kernel NULL pointer dereference at 00000000000000f0
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 3 PID: 6032 Comm: python Not tainted 4.19.0-041900rc7-lowlatency #201810071631
Hardware name: LENOVO 80UG/Toronto 4A2, BIOS 0XCN45WW 08/09/2018
RIP: 0010:smaps_pte_range+0x32d/0x540
Code: 80 00 00 00 00 74 a9 48 89 de 41 f6 40 52 40 0f 85 04 02 00 00 49 2b 30 48 c1 ee 0c 49 03 b0 98 00 00 00 49 8b 80 a0 00 00 00 <48> 8b b8 f0 00 00 00 e8 b7 ef ec ff 48 85 c0 0f 84 71 ff ff ff a8
RSP: 0018:ffffb0cbc484fb88 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 0000560ddb9e9000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000560ddb9e9 RDI: 0000000000000001
RBP: ffffb0cbc484fbc0 R08: ffff94a5a227a578 R09: ffff94a5a227a578
R10: 0000000000000000 R11: 0000560ddbbe7000 R12: ffffe903098ba728
R13: ffffb0cbc484fc78 R14: ffffb0cbc484fcf8 R15: ffff94a5a2e9cf48
FS: 00007f6dfb683740(0000) GS:ffff94a5aaf80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000f0 CR3: 000000011c118001 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
__walk_page_range+0x3c2/0x6f0
walk_page_vma+0x42/0x60
smap_gather_stats+0x79/0xe0
? gather_pte_stats+0x320/0x320
? gather_hugetlb_stats+0x70/0x70
show_smaps_rollup+0xcd/0x1c0
seq_read+0x157/0x400
__vfs_read+0x3a/0x180
? security_file_permission+0x93/0xc0
? security_file_permission+0x93/0xc0
vfs_read+0x8f/0x140
ksys_read+0x55/0xc0
__x64_sys_read+0x1a/0x20
do_syscall_64+0x5a/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Decoded code matched to local compilation+disassembly points to
smaps_pte_entry():
} else if (unlikely(IS_ENABLED(CONFIG_SHMEM) && mss->check_shmem_swap
&& pte_none(*pte))) {
page = find_get_entry(vma->vm_file->f_mapping,
linear_page_index(vma, addr));
Here, vma->vm_file is NULL. mss->check_shmem_swap should be false in that
case, however for smaps_rollup, smap_gather_stats() can set the flag true
for one vma and leave it true for subsequent vma's where it should be
false.
To fix, reset the check_shmem_swap flag to false. There's also related
bug which sets mss->swap to shmem_swapped, which in the context of
smaps_rollup overwrites any value accumulated from previous vma's. Fix
that as well.
Note that the report suggests a regression between 4.17.19 and 4.19-rc7,
which makes the 4.19 series ending with commit 258f669e7e88 ("mm:
/proc/pid/smaps_rollup: convert to single value seq_file") suspicious.
But the mss was reused for rollup since 493b0e9d945f ("mm: add
/proc/pid/smaps_rollup") so let's play it safe with the stable backport.
Link: http://lkml.kernel.org/r/555fbd1f-4ac9-0b58-dcd4-5dc4380ff7ca@suse.cz
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201377
Fixes: 493b0e9d945f ("mm: add /proc/pid/smaps_rollup")
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Reported-by: Leonardo Soares Mller <leozinho29_eu(a)hotmail.com>
Tested-by: Leonardo Soares Mller <leozinho29_eu(a)hotmail.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Daniel Colascione <dancol(a)google.com>
Cc: Alexey Dobriyan <adobriyan(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/task_mmu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/fs/proc/task_mmu.c~mm-proc-pid-smaps_rollup-fix-null-pointer-deref-in-smaps_pte_range
+++ a/fs/proc/task_mmu.c
@@ -713,6 +713,8 @@ static void smap_gather_stats(struct vm_
smaps_walk.private = mss;
#ifdef CONFIG_SHMEM
+ /* In case of smaps_rollup, reset the value from previous vma */
+ mss->check_shmem_swap = false;
if (vma->vm_file && shmem_mapping(vma->vm_file->f_mapping)) {
/*
* For shared or readonly shmem mappings we know that all
@@ -728,7 +730,7 @@ static void smap_gather_stats(struct vm_
if (!shmem_swapped || (vma->vm_flags & VM_SHARED) ||
!(vma->vm_flags & VM_WRITE)) {
- mss->swap = shmem_swapped;
+ mss->swap += shmem_swapped;
} else {
mss->check_shmem_swap = true;
smaps_walk.pte_hole = smaps_pte_hole;
_
Patches currently in -mm which might be from vbabka(a)suse.cz are
mm-proc-pid-smaps_rollup-fix-null-pointer-deref-in-smaps_pte_range.patch
mm-slab-combine-kmalloc_caches-and-kmalloc_dma_caches.patch
mm-slab-slub-introduce-kmalloc-reclaimable-caches.patch
dcache-allocate-external-names-from-reclaimable-kmalloc-caches.patch
mm-rename-and-change-semantics-of-nr_indirectly_reclaimable_bytes.patch
mm-proc-add-kreclaimable-to-proc-meminfo.patch
mm-slab-shorten-kmalloc-cache-names-for-large-sizes.patch
Cc: stable(a)vger.kernel.org <stable(a)vger.kernel.org>
Le 13/10/2018 à 11:16, Christophe Leroy a écrit :
> commit b96672dd840f ("powerpc: Machine check interrupt is a non-
> maskable interrupt") added a call to nmi_enter() at the beginning of
> machine check restart exception handler. Due to that, in_interrupt()
> always returns true regardless of the state before entering the
> exception, and die() panics even when the system was not already in
> interrupt.
>
> This patch calls nmi_exit() before calling die() in order to restore
> the interrupt state we had before calling nmi_enter()
>
> Fixes: b96672dd840f ("powerpc: Machine check interrupt is a non-maskable interrupt")
> Signed-off-by: Christophe Leroy <christophe.leroy(a)c-s.fr>
> ---
> arch/powerpc/kernel/traps.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index fd58749b4d6b..4f880c2a6e4c 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -765,12 +765,17 @@ void machine_check_exception(struct pt_regs *regs)
> if (check_io_access(regs))
> goto bail;
>
> - die("Machine check", regs, SIGBUS);
> -
> /* Must die if the interrupt is not recoverable */
> if (!(regs->msr & MSR_RI))
> nmi_panic(regs, "Unrecoverable Machine check");
>
> + if (!nested)
> + nmi_exit();
> +
> + die("Machine check", regs, SIGBUS);
> +
> + return;
> +
> bail:
> if (!nested)
> nmi_exit();
>
The Address Range Scrub implementation tried to skip running scrubs
against ranges that were already scrubbed by the BIOS. Unfortunately
that support also resulted in early scrub completions as evidenced by
this debug output from nfit_test:
nd_region region9: ARS: range 1 short complete
nd_region region3: ARS: range 1 short complete
nd_region region4: ARS: range 2 ARS start (0)
nd_region region4: ARS: range 2 short complete
...i.e. completions without any indications that the scrub was started.
This state of affairs was hard to see in the code due to the
proliferation of state bits and mistakenly trying to track done state
per-range when the completion is a global property of the bus.
So, kill the four ARS state bits (ARS_REQ, ARS_REQ_REDO, ARS_DONE, and
ARS_SHORT), and replace them with just 2 request flags ARS_REQ_SHORT and
ARS_REQ_LONG. The implementation will still complete and reap the
results of BIOS initiated ARS, but it will not attempt to use that
information to affect the completion status of scrubbing the ranges from
a Linux perspective.
Instead, try to synchronously run a short ARS per range at init time and
schedule a long scrub in the background. If ARS is busy with an ARS
request schedule both a short and a long scrub for when ARS returns to
idle. This logic also satisfies the intent of what ARS_REQ_REDO was
trying to achieve. The new rule is that the REQ flag stays set until the
next successful ars_start() for that range.
With the new policy that the REQ flags are not cleared until the next
start, the implementation no longer loses requests as can be seen from
the following log:
nd_region region3: ARS: range 1 ARS start short (0)
nd_region region9: ARS: range 1 ARS start short (0)
nd_region region3: ARS: range 1 complete
nd_region region4: ARS: range 2 ARS start short (0)
nd_region region9: ARS: range 1 complete
nd_region region9: ARS: range 1 ARS start long (0)
nd_region region4: ARS: range 2 complete
nd_region region3: ARS: range 1 ARS start long (0)
nd_region region9: ARS: range 1 complete
nd_region region3: ARS: range 1 complete
nd_region region4: ARS: range 2 ARS start long (0)
nd_region region4: ARS: range 2 complete
...note that the nfit_test emulated driver provides 2 buses, that is why
some of the range indices are duplicated. Notice that each range
now successfully completes a short and long scrub.
Cc: <stable(a)vger.kernel.org>
Fixes: 14c73f997a5e ("nfit, address-range-scrub: introduce nfit_spa->ars_state")
Fixes: cc3d3458d46f ("acpi/nfit: queue issuing of ars when an uc error...")
Reported-by: Jacek Zloch <jacek.zloch(a)intel.com>
Reported-by: Krzysztof Rusocki <krzysztof.rusocki(a)intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
drivers/acpi/nfit/core.c | 169 ++++++++++++++++++++++++++--------------------
drivers/acpi/nfit/nfit.h | 10 +--
2 files changed, 101 insertions(+), 78 deletions(-)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index a0dfbcf8220d..f7efcd9843e0 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -2572,7 +2572,8 @@ static int ars_get_cap(struct acpi_nfit_desc *acpi_desc,
return cmd_rc;
}
-static int ars_start(struct acpi_nfit_desc *acpi_desc, struct nfit_spa *nfit_spa)
+static int ars_start(struct acpi_nfit_desc *acpi_desc,
+ struct nfit_spa *nfit_spa, enum nfit_ars_state req_type)
{
int rc;
int cmd_rc;
@@ -2583,7 +2584,7 @@ static int ars_start(struct acpi_nfit_desc *acpi_desc, struct nfit_spa *nfit_spa
memset(&ars_start, 0, sizeof(ars_start));
ars_start.address = spa->address;
ars_start.length = spa->length;
- if (test_bit(ARS_SHORT, &nfit_spa->ars_state))
+ if (req_type == ARS_REQ_SHORT)
ars_start.flags = ND_ARS_RETURN_PREV_DATA;
if (nfit_spa_type(spa) == NFIT_SPA_PM)
ars_start.type = ND_ARS_PERSISTENT;
@@ -2640,6 +2641,15 @@ static void ars_complete(struct acpi_nfit_desc *acpi_desc,
struct nd_region *nd_region = nfit_spa->nd_region;
struct device *dev;
+ lockdep_assert_held(&acpi_desc->init_mutex);
+ /*
+ * Only advance the ARS state for ARS runs initiated by the
+ * kernel, ignore ARS results from BIOS initiated runs for scrub
+ * completion tracking.
+ */
+ if (acpi_desc->scrub_spa != nfit_spa)
+ return;
+
if ((ars_status->address >= spa->address && ars_status->address
< spa->address + spa->length)
|| (ars_status->address < spa->address)) {
@@ -2659,28 +2669,13 @@ static void ars_complete(struct acpi_nfit_desc *acpi_desc,
} else
return;
- if (test_bit(ARS_DONE, &nfit_spa->ars_state))
- return;
-
- if (!test_and_clear_bit(ARS_REQ, &nfit_spa->ars_state))
- return;
-
+ acpi_desc->scrub_spa = NULL;
if (nd_region) {
dev = nd_region_dev(nd_region);
nvdimm_region_notify(nd_region, NVDIMM_REVALIDATE_POISON);
} else
dev = acpi_desc->dev;
-
- dev_dbg(dev, "ARS: range %d %s complete\n", spa->range_index,
- test_bit(ARS_SHORT, &nfit_spa->ars_state)
- ? "short" : "long");
- clear_bit(ARS_SHORT, &nfit_spa->ars_state);
- if (test_and_clear_bit(ARS_REQ_REDO, &nfit_spa->ars_state)) {
- set_bit(ARS_SHORT, &nfit_spa->ars_state);
- set_bit(ARS_REQ, &nfit_spa->ars_state);
- dev_dbg(dev, "ARS: processing scrub request received while in progress\n");
- } else
- set_bit(ARS_DONE, &nfit_spa->ars_state);
+ dev_dbg(dev, "ARS: range %d complete\n", spa->range_index);
}
static int ars_status_process_records(struct acpi_nfit_desc *acpi_desc)
@@ -2961,46 +2956,55 @@ static int acpi_nfit_query_poison(struct acpi_nfit_desc *acpi_desc)
return 0;
}
-static int ars_register(struct acpi_nfit_desc *acpi_desc, struct nfit_spa *nfit_spa,
- int *query_rc)
+static int ars_register(struct acpi_nfit_desc *acpi_desc,
+ struct nfit_spa *nfit_spa)
{
- int rc = *query_rc;
+ int rc;
- if (no_init_ars)
+ if (no_init_ars || test_bit(ARS_FAILED, &nfit_spa->ars_state))
return acpi_nfit_register_region(acpi_desc, nfit_spa);
- set_bit(ARS_REQ, &nfit_spa->ars_state);
- set_bit(ARS_SHORT, &nfit_spa->ars_state);
+ set_bit(ARS_REQ_SHORT, &nfit_spa->ars_state);
+ set_bit(ARS_REQ_LONG, &nfit_spa->ars_state);
- switch (rc) {
+ switch (acpi_nfit_query_poison(acpi_desc)) {
case 0:
case -EAGAIN:
- rc = ars_start(acpi_desc, nfit_spa);
- if (rc == -EBUSY) {
- *query_rc = rc;
+ rc = ars_start(acpi_desc, nfit_spa, ARS_REQ_SHORT);
+ /* shouldn't happen, try again later */
+ if (rc == -EBUSY)
break;
- } else if (rc == 0) {
- rc = acpi_nfit_query_poison(acpi_desc);
- } else {
+ if (rc) {
set_bit(ARS_FAILED, &nfit_spa->ars_state);
break;
}
- if (rc == -EAGAIN)
- clear_bit(ARS_SHORT, &nfit_spa->ars_state);
- else if (rc == 0)
- ars_complete(acpi_desc, nfit_spa);
+ clear_bit(ARS_REQ_SHORT, &nfit_spa->ars_state);
+ rc = acpi_nfit_query_poison(acpi_desc);
+ if (rc)
+ break;
+ acpi_desc->scrub_spa = nfit_spa;
+ ars_complete(acpi_desc, nfit_spa);
+ /*
+ * If ars_complete() says we didn't complete the
+ * short scrub, we'll try again with a long
+ * request.
+ */
+ acpi_desc->scrub_spa = NULL;
break;
case -EBUSY:
+ case -ENOMEM:
case -ENOSPC:
+ /*
+ * BIOS was using ARS, wait for it to complete (or
+ * resources to become available) and then perform our
+ * own scrubs.
+ */
break;
default:
set_bit(ARS_FAILED, &nfit_spa->ars_state);
break;
}
- if (test_and_clear_bit(ARS_DONE, &nfit_spa->ars_state))
- set_bit(ARS_REQ, &nfit_spa->ars_state);
-
return acpi_nfit_register_region(acpi_desc, nfit_spa);
}
@@ -3022,6 +3026,8 @@ static unsigned int __acpi_nfit_scrub(struct acpi_nfit_desc *acpi_desc,
struct device *dev = acpi_desc->dev;
struct nfit_spa *nfit_spa;
+ lockdep_assert_held(&acpi_desc->init_mutex);
+
if (acpi_desc->cancel)
return 0;
@@ -3045,21 +3051,49 @@ static unsigned int __acpi_nfit_scrub(struct acpi_nfit_desc *acpi_desc,
ars_complete_all(acpi_desc);
list_for_each_entry(nfit_spa, &acpi_desc->spas, list) {
+ enum nfit_ars_state req_type;
+ int rc;
+
if (test_bit(ARS_FAILED, &nfit_spa->ars_state))
continue;
- if (test_bit(ARS_REQ, &nfit_spa->ars_state)) {
- int rc = ars_start(acpi_desc, nfit_spa);
-
- clear_bit(ARS_DONE, &nfit_spa->ars_state);
- dev = nd_region_dev(nfit_spa->nd_region);
- dev_dbg(dev, "ARS: range %d ARS start (%d)\n",
- nfit_spa->spa->range_index, rc);
- if (rc == 0 || rc == -EBUSY)
- return 1;
- dev_err(dev, "ARS: range %d ARS failed (%d)\n",
- nfit_spa->spa->range_index, rc);
- set_bit(ARS_FAILED, &nfit_spa->ars_state);
+
+ /* prefer short ARS requests first */
+ if (test_bit(ARS_REQ_SHORT, &nfit_spa->ars_state))
+ req_type = ARS_REQ_SHORT;
+ else if (test_bit(ARS_REQ_LONG, &nfit_spa->ars_state))
+ req_type = ARS_REQ_LONG;
+ else
+ continue;
+ rc = ars_start(acpi_desc, nfit_spa, req_type);
+
+ dev = nd_region_dev(nfit_spa->nd_region);
+ dev_dbg(dev, "ARS: range %d ARS start %s (%d)\n",
+ nfit_spa->spa->range_index,
+ req_type == ARS_REQ_SHORT ? "short" : "long",
+ rc);
+ /*
+ * Hmm, we raced someone else starting ARS? Try again in
+ * a bit.
+ */
+ if (rc == -EBUSY)
+ return 1;
+ if (rc == 0) {
+ dev_WARN_ONCE(dev, acpi_desc->scrub_spa,
+ "scrub start while range %d active\n",
+ acpi_desc->scrub_spa->spa->range_index);
+ clear_bit(req_type, &nfit_spa->ars_state);
+ acpi_desc->scrub_spa = nfit_spa;
+ /*
+ * Consider this spa last for future scrub
+ * requests
+ */
+ list_move_tail(&nfit_spa->list, &acpi_desc->spas);
+ return 1;
}
+
+ dev_err(dev, "ARS: range %d ARS failed (%d)\n",
+ nfit_spa->spa->range_index, rc);
+ set_bit(ARS_FAILED, &nfit_spa->ars_state);
}
return 0;
}
@@ -3115,6 +3149,7 @@ static void acpi_nfit_init_ars(struct acpi_nfit_desc *acpi_desc,
struct nd_cmd_ars_cap ars_cap;
int rc;
+ set_bit(ARS_FAILED, &nfit_spa->ars_state);
memset(&ars_cap, 0, sizeof(ars_cap));
rc = ars_get_cap(acpi_desc, &ars_cap, nfit_spa);
if (rc < 0)
@@ -3131,16 +3166,14 @@ static void acpi_nfit_init_ars(struct acpi_nfit_desc *acpi_desc,
nfit_spa->clear_err_unit = ars_cap.clear_err_unit;
acpi_desc->max_ars = max(nfit_spa->max_ars, acpi_desc->max_ars);
clear_bit(ARS_FAILED, &nfit_spa->ars_state);
- set_bit(ARS_REQ, &nfit_spa->ars_state);
}
static int acpi_nfit_register_regions(struct acpi_nfit_desc *acpi_desc)
{
struct nfit_spa *nfit_spa;
- int rc, query_rc;
+ int rc;
list_for_each_entry(nfit_spa, &acpi_desc->spas, list) {
- set_bit(ARS_FAILED, &nfit_spa->ars_state);
switch (nfit_spa_type(nfit_spa->spa)) {
case NFIT_SPA_VOLATILE:
case NFIT_SPA_PM:
@@ -3149,20 +3182,12 @@ static int acpi_nfit_register_regions(struct acpi_nfit_desc *acpi_desc)
}
}
- /*
- * Reap any results that might be pending before starting new
- * short requests.
- */
- query_rc = acpi_nfit_query_poison(acpi_desc);
- if (query_rc == 0)
- ars_complete_all(acpi_desc);
-
list_for_each_entry(nfit_spa, &acpi_desc->spas, list)
switch (nfit_spa_type(nfit_spa->spa)) {
case NFIT_SPA_VOLATILE:
case NFIT_SPA_PM:
/* register regions and kick off initial ARS run */
- rc = ars_register(acpi_desc, nfit_spa, &query_rc);
+ rc = ars_register(acpi_desc, nfit_spa);
if (rc)
return rc;
break;
@@ -3374,7 +3399,8 @@ static int acpi_nfit_clear_to_send(struct nvdimm_bus_descriptor *nd_desc,
return __acpi_nfit_clear_to_send(nd_desc, nvdimm, cmd);
}
-int acpi_nfit_ars_rescan(struct acpi_nfit_desc *acpi_desc, unsigned long flags)
+int acpi_nfit_ars_rescan(struct acpi_nfit_desc *acpi_desc,
+ enum nfit_ars_state req_type)
{
struct device *dev = acpi_desc->dev;
int scheduled = 0, busy = 0;
@@ -3394,14 +3420,10 @@ int acpi_nfit_ars_rescan(struct acpi_nfit_desc *acpi_desc, unsigned long flags)
if (test_bit(ARS_FAILED, &nfit_spa->ars_state))
continue;
- if (test_and_set_bit(ARS_REQ, &nfit_spa->ars_state)) {
+ if (test_and_set_bit(req_type, &nfit_spa->ars_state))
busy++;
- set_bit(ARS_REQ_REDO, &nfit_spa->ars_state);
- } else {
- if (test_bit(ARS_SHORT, &flags))
- set_bit(ARS_SHORT, &nfit_spa->ars_state);
+ else
scheduled++;
- }
}
if (scheduled) {
sched_ars(acpi_desc);
@@ -3587,10 +3609,11 @@ static void acpi_nfit_update_notify(struct device *dev, acpi_handle handle)
static void acpi_nfit_uc_error_notify(struct device *dev, acpi_handle handle)
{
struct acpi_nfit_desc *acpi_desc = dev_get_drvdata(dev);
- unsigned long flags = (acpi_desc->scrub_mode == HW_ERROR_SCRUB_ON) ?
- 0 : 1 << ARS_SHORT;
- acpi_nfit_ars_rescan(acpi_desc, flags);
+ if (acpi_desc->scrub_mode == HW_ERROR_SCRUB_ON)
+ acpi_nfit_ars_rescan(acpi_desc, ARS_REQ_LONG);
+ else
+ acpi_nfit_ars_rescan(acpi_desc, ARS_REQ_SHORT);
}
void __acpi_nfit_notify(struct device *dev, acpi_handle handle, u32 event)
diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h
index 8c1af38a5dee..a7d95b427efd 100644
--- a/drivers/acpi/nfit/nfit.h
+++ b/drivers/acpi/nfit/nfit.h
@@ -133,10 +133,8 @@ enum nfit_dimm_notifiers {
};
enum nfit_ars_state {
- ARS_REQ,
- ARS_REQ_REDO,
- ARS_DONE,
- ARS_SHORT,
+ ARS_REQ_SHORT,
+ ARS_REQ_LONG,
ARS_FAILED,
};
@@ -223,6 +221,7 @@ struct acpi_nfit_desc {
struct device *dev;
u8 ars_start_flags;
struct nd_cmd_ars_status *ars_status;
+ struct nfit_spa *scrub_spa;
struct delayed_work dwork;
struct list_head list;
struct kernfs_node *scrub_count_state;
@@ -277,7 +276,8 @@ struct nfit_blk {
extern struct list_head acpi_descs;
extern struct mutex acpi_desc_lock;
-int acpi_nfit_ars_rescan(struct acpi_nfit_desc *acpi_desc, unsigned long flags);
+int acpi_nfit_ars_rescan(struct acpi_nfit_desc *acpi_desc,
+ enum nfit_ars_state req_type);
#ifdef CONFIG_X86_MCE
void nfit_mce_register(void);
Did you get my email from last week?
Let me know if you have photos for cutting out or retouching?
We are an image team who can do editing for your the web store photos,
industry photos or portrait photos.
Send photos, we will do testing for you to check quality.
Waiting for your reply soon.
Thanks,
Judy
The recent patch to fix the afs_server struct leak didn't actually fix the
bug, but rather fixed some of the symptoms. The problem is that an
asynchronous call that holds a resource pointed to by call->reply[0] will
find the pointer cleared in the call destructor, thereby preventing the
resource from being cleaned up.
In the case of the server record leak, the afs_fs_get_capabilities()
function in devel code sets up a call with reply[0] pointing at the server
record that should be altered when the result is obtained, but this was
being cleared before the destructor was called, so the put in the
destructor does nothing and the record is leaked.
Commit f014ffb025c1 removed the additional ref obtained by
afs_install_server(), but the removal of this ref is actually used by the
garbage collector to mark a server record as being defunct after the record
has expired through lack of use.
The offending clearance of call->reply[0] upon completion in
afs_process_async_call() has been there from the origin of the code, but
none of the asynchronous calls actually use that pointer currently, so it
should be safe to remove (note that synchronous calls don't involve this
function).
Fix this by the following means:
(1) Revert commit f014ffb025c1.
(2) Remove the clearance of reply[0] from afs_process_async_call().
Without this, afs_manage_servers() will suffer an assertion failure if it
sees a server record that didn't get used because the usage count is not 1.
Fixes: f014ffb025c1 ("afs: Fix afs_server struct leak")
Fixes: 08e0e7c82eea ("[AF_RXRPC]: Make the in-kernel AFS filesystem use AF_RXRPC.")
Signed-off-by: David Howells <dhowells(a)redhat.com>
---
fs/afs/rxrpc.c | 2 --
fs/afs/server.c | 2 --
2 files changed, 4 deletions(-)
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 748b37b130a2..9bbb8af000b4 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -620,8 +620,6 @@ static void afs_process_async_call(struct work_struct *work)
}
if (call->state == AFS_CALL_COMPLETE) {
- call->reply[0] = NULL;
-
/* We have two refs to release - one from the alloc and one
* queued with the work item - and we can't just deallocate the
* call because the work item may be queued again.
diff --git a/fs/afs/server.c b/fs/afs/server.c
index d5ef05e24e18..1a087eb8f2d7 100644
--- a/fs/afs/server.c
+++ b/fs/afs/server.c
@@ -199,11 +199,9 @@ static struct afs_server *afs_install_server(struct afs_net *net,
write_sequnlock(&net->fs_addr_lock);
ret = 0;
- goto out;
exists:
afs_get_server(server);
-out:
write_sequnlock(&net->fs_lock);
return server;
}
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: cec: forgot to cancel delayed work
Author: Hans Verkuil <hverkuil(a)xs4all.nl>
Date: Mon Oct 15 06:14:22 2018 -0400
If the wait for completion was interrupted, then make sure to cancel
any delayed work.
This can only happen if a transmit is waiting for a reply, and you press
Ctrl-C or reboot/poweroff or something like that which interrupts the
thread waiting for the reply and then proceeds to delete the CEC message.
Since the delayed work wasn't canceled, once it would trigger it referred
to stale data and resulted in a kernel oops.
Fixes: 7ec2b3b941a6 ("cec: add new tx/rx status bits to detect aborts/timeouts")
Signed-off-by: Hans Verkuil <hans.verkuil(a)cisco.com>
Cc: <stable(a)vger.kernel.org> # for v4.18 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung(a)kernel.org>
drivers/media/cec/cec-adap.c | 2 ++
1 file changed, 2 insertions(+)
---
diff --git a/drivers/media/cec/cec-adap.c b/drivers/media/cec/cec-adap.c
index 0c0d9107383e..31d1f4ab915e 100644
--- a/drivers/media/cec/cec-adap.c
+++ b/drivers/media/cec/cec-adap.c
@@ -844,6 +844,8 @@ int cec_transmit_msg_fh(struct cec_adapter *adap, struct cec_msg *msg,
*/
mutex_unlock(&adap->lock);
wait_for_completion_killable(&data->c);
+ if (!data->completed)
+ cancel_delayed_work_sync(&data->work);
mutex_lock(&adap->lock);
/* Cancel the transmit if it was interrupted */
Hi all,
Three fixes that worth to have in the @stable, as they were hit by
different people, including Arista on v4.9 stable.
And for linux-next - adding lockdep asserts for line discipline changing
code, verifying that write ldisc sem will be held forthwith.
The last patch is an optional and probably, timeout can be dropped for
read_lock(). I'll do it if everyone agrees.
(Or as per discussion with Peter in v3, just convert ldisc to
a regular rwsem).
Thanks,
Dima
Changes since v4:
- back to lock ldisc with (5*HZ) timeout in tty_reopen()
(LKP report link: lkml.kernel.org/r/<1536940609.3185.29.camel(a)arista.com>)
- reordered 3/7 with 2/7 for LKP robot
Changes since v3:
- Added tested-by Mark Rutland (thanks!)
- Dropped patch with smp_wmb() - wrong idea
- lockdep_assert_held() should be actually lockdep_assert_held_exclusive()
- Described why tty_ldisc_open() can be called without ldisc_sem held
for pty slave end (o_tty).
- Added Peter's patch for dropping self-made lockdep annotations
- Fix for a reader(s) of ldisc semaphore waiting for an active reader(s)
Changes since v2:
- Added reviewed-by tags
- Hopefully, fixed reported by 0-day issue.
- Added optional fix for wait_readers decrement
Changes since v1:
- Added tested-by/reported-by tags
- Dropped 3/4 (locking tty pair for lockdep sake),
Because of that - not adding lockdep_assert_held() in tty_ldisc_open()
- Added 4/4 cleanup to inc tty->count only on success of
tty_ldisc_reinit()
- lock ldisc without (5*HZ) timeout in tty_reopen()
v1 link: lkml.kernel.org/r/<20180829022353.23568-1-dima(a)arista.com>
v2 link: lkml.kernel.org/r/<20180903165257.29227-1-dima(a)arista.com>
v3 link: lkml.kernel.org/r/<20180911014821.26286-1-dima(a)arista.com>
v4 link: lkml.kernel.org/r/<20180912001702.18522-1-dima(a)arista.com>
Cc: Daniel Axtens <dja(a)axtens.net>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Michael Neuling <mikey(a)neuling.org>
Cc: Mikulas Patocka <mpatocka(a)redhat.com>
Cc: Nathan March <nathan(a)gt.net>
Cc: Pasi Kärkkäinen <pasik(a)iki.fi>
Cc: Peter Hurley <peter(a)hurleysoftware.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: "Rong, Chen" <rong.a.chen(a)intel.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work(a)gmail.com>
Cc: Tan Xiaojun <tanxiaojun(a)huawei.com>
Cc: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
(please, ignore if I Cc'ed you mistakenly)
Dmitry Safonov (6):
tty: Drop tty->count on tty_reopen() failure
tty/ldsem: Wake up readers after timed out down_write()
tty: Hold tty_ldisc_lock() during tty_reopen()
tty: Simplify tty->count math in tty_reopen()
tty/ldsem: Add lockdep asserts for ldisc_sem
tty/ldsem: Decrement wait_readers on timeouted down_read()
Peter Zijlstra (1):
tty/ldsem: Convert to regular lockdep annotations
drivers/tty/tty_io.c | 13 ++++++++---
drivers/tty/tty_ldisc.c | 9 +++++++
drivers/tty/tty_ldsem.c | 62 ++++++++++++++++++++-----------------------------
3 files changed, 44 insertions(+), 40 deletions(-)
--
2.13.6