The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ef68e831840c40c7d01b328b3c0f5d8c4796c232 Mon Sep 17 00:00:00 2001
From: Pavel Shilovsky <pshilov(a)microsoft.com>
Date: Fri, 18 Jan 2019 17:25:36 -0800
Subject: [PATCH] CIFS: Do not reconnect TCP session in add_credits()
When executing add_credits() we currently call cifs_reconnect()
if the number of credits is zero and there are no requests in
flight. In this case we may call cifs_reconnect() recursively
twice and cause memory corruption given the following sequence
of functions:
mid1.callback() -> add_credits() -> cifs_reconnect() ->
-> mid2.callback() -> add_credits() -> cifs_reconnect().
Fix this by avoiding to call cifs_reconnect() in add_credits()
and checking for zero credits in the demultiplex thread.
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilov(a)microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber(a)redhat.com>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 683310f26171..8463c940e0e5 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -720,6 +720,21 @@ server_unresponsive(struct TCP_Server_Info *server)
return false;
}
+static inline bool
+zero_credits(struct TCP_Server_Info *server)
+{
+ int val;
+
+ spin_lock(&server->req_lock);
+ val = server->credits + server->echo_credits + server->oplock_credits;
+ if (server->in_flight == 0 && val == 0) {
+ spin_unlock(&server->req_lock);
+ return true;
+ }
+ spin_unlock(&server->req_lock);
+ return false;
+}
+
static int
cifs_readv_from_socket(struct TCP_Server_Info *server, struct msghdr *smb_msg)
{
@@ -732,6 +747,12 @@ cifs_readv_from_socket(struct TCP_Server_Info *server, struct msghdr *smb_msg)
for (total_read = 0; msg_data_left(smb_msg); total_read += length) {
try_to_freeze();
+ /* reconnect if no credits and no requests in flight */
+ if (zero_credits(server)) {
+ cifs_reconnect(server);
+ return -ECONNABORTED;
+ }
+
if (server_unresponsive(server))
return -ECONNABORTED;
if (cifs_rdma_enabled(server) && server->smbd_conn)
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index 1d3ce127b02e..fa8d2e1076c8 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -34,6 +34,7 @@
#include "cifs_ioctl.h"
#include "smbdirect.h"
+/* Change credits for different ops and return the total number of credits */
static int
change_conf(struct TCP_Server_Info *server)
{
@@ -41,17 +42,15 @@ change_conf(struct TCP_Server_Info *server)
server->oplock_credits = server->echo_credits = 0;
switch (server->credits) {
case 0:
- return -1;
+ return 0;
case 1:
server->echoes = false;
server->oplocks = false;
- cifs_dbg(VFS, "disabling echoes and oplocks\n");
break;
case 2:
server->echoes = true;
server->oplocks = false;
server->echo_credits = 1;
- cifs_dbg(FYI, "disabling oplocks\n");
break;
default:
server->echoes = true;
@@ -64,14 +63,15 @@ change_conf(struct TCP_Server_Info *server)
server->echo_credits = 1;
}
server->credits -= server->echo_credits + server->oplock_credits;
- return 0;
+ return server->credits + server->echo_credits + server->oplock_credits;
}
static void
smb2_add_credits(struct TCP_Server_Info *server, const unsigned int add,
const int optype)
{
- int *val, rc = 0;
+ int *val, rc = -1;
+
spin_lock(&server->req_lock);
val = server->ops->get_credits_field(server, optype);
@@ -101,8 +101,26 @@ smb2_add_credits(struct TCP_Server_Info *server, const unsigned int add,
}
spin_unlock(&server->req_lock);
wake_up(&server->request_q);
- if (rc)
- cifs_reconnect(server);
+
+ if (server->tcpStatus == CifsNeedReconnect)
+ return;
+
+ switch (rc) {
+ case -1:
+ /* change_conf hasn't been executed */
+ break;
+ case 0:
+ cifs_dbg(VFS, "Possible client or server bug - zero credits\n");
+ break;
+ case 1:
+ cifs_dbg(VFS, "disabling echoes and oplocks\n");
+ break;
+ case 2:
+ cifs_dbg(FYI, "disabling oplocks\n");
+ break;
+ default:
+ cifs_dbg(FYI, "add %u credits total=%d\n", add, rc);
+ }
}
static void
When resuming, we check whether or not any previously connected
MST topologies are still present and if so, attempt to resume them. If
this fails, we disable said MST topologies and fire off a hotplug event
so that userspace knows to reprobe.
However, sending a hotplug event involves calling
drm_fb_helper_hotplug_event(), which in turn results in fbcon doing a
connector reprobe in the caller's thread - something we can't do at the
point in which i915 calls drm_dp_mst_topology_mgr_resume() since
hotplugging hasn't been fully initialized yet.
This currently causes some rather subtle but fatal issues. For example,
on my T480s the laptop dock connected to it usually disappears during a
suspend cycle, and comes back up a short while after the system has been
resumed. This guarantees pretty much every suspend and resume cycle,
drm_dp_mst_topology_mgr_set_mst(mgr, false); will be caused and in turn,
a connector hotplug will occur. Now it's Rute Goldberg time: when the
connector hotplug occurs, i915 reprobes /all/ of the connectors,
including eDP. However, eDP probing requires that we power on the panel
VDD which in turn, grabs a wakeref to the appropriate power domain on
the GPU (on my T480s, this is the PORT_DDI_A_IO domain). This is where
things start breaking, since this all happens before
intel_power_domains_enable() is called we end up leaking the wakeref
that was acquired and never releasing it later. Come next suspend/resume
cycle, this causes us to fail to shut down the GPU properly, which
causes it not to resume properly and die a horrible complicated death.
(as a note: this only happens when there's both an eDP panel and MST
topology connected which is removed mid-suspend. One or the other seems
to always be OK).
We could try to fix the VDD wakeref leak, but this doesn't seem like
it's worth it at all since we aren't able to handle hotplug detection
while resuming anyway. So, let's go with a more robust solution inspired
by nouveau: block fbdev from handling hotplug events until we resume
fbdev. This allows us to still send sysfs hotplug events to be handled
later by user space while we're resuming, while also preventing us from
actually processing any hotplug events we receive until it's safe.
This fixes the wakeref leak observed on the T480s and as such, also
fixes suspend/resume with MST topologies connected on this machine.
Changes since v2:
* Don't call drm_fb_helper_hotplug_event() under lock, do it after lock
(Chris Wilson)
* Don't call drm_fb_helper_hotplug_event() in
intel_fbdev_output_poll_changed() under lock (Chris Wilson)
* Always set ifbdev->hpd_waiting (Chris Wilson)
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Fixes: 0e32b39ceed6 ("drm/i915: add DP 1.2 MST support (v0.7)")
Cc: Todd Previte <tprevite(a)gmail.com>
Cc: Dave Airlie <airlied(a)redhat.com>
Cc: Jani Nikula <jani.nikula(a)linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Cc: Imre Deak <imre.deak(a)intel.com>
Cc: intel-gfx(a)lists.freedesktop.org
Cc: <stable(a)vger.kernel.org> # v3.17+
---
drivers/gpu/drm/i915/intel_drv.h | 10 +++++++++
drivers/gpu/drm/i915/intel_fbdev.c | 33 +++++++++++++++++++++++++++++-
2 files changed, 42 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 90ba5436370e..9ec3d00fbd19 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -213,6 +213,16 @@ struct intel_fbdev {
unsigned long vma_flags;
async_cookie_t cookie;
int preferred_bpp;
+
+ /* Whether or not fbdev hpd processing is temporarily suspended */
+ bool hpd_suspended : 1;
+ /* Set when a hotplug was received while HPD processing was
+ * suspended
+ */
+ bool hpd_waiting : 1;
+
+ /* Protects hpd_suspended */
+ struct mutex hpd_lock;
};
struct intel_encoder {
diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index 8cf3efe88f02..376ffe842e26 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -681,6 +681,7 @@ int intel_fbdev_init(struct drm_device *dev)
if (ifbdev == NULL)
return -ENOMEM;
+ mutex_init(&ifbdev->hpd_lock);
drm_fb_helper_prepare(dev, &ifbdev->helper, &intel_fb_helper_funcs);
if (!intel_fbdev_init_bios(dev, ifbdev))
@@ -754,6 +755,26 @@ void intel_fbdev_fini(struct drm_i915_private *dev_priv)
intel_fbdev_destroy(ifbdev);
}
+/* Suspends/resumes fbdev processing of incoming HPD events. When resuming HPD
+ * processing, fbdev will perform a full connector reprobe if a hotplug event
+ * was received while HPD was suspended.
+ */
+static void intel_fbdev_hpd_set_suspend(struct intel_fbdev *ifbdev, int state)
+{
+ bool send_hpd = false;
+
+ mutex_lock(&ifbdev->hpd_lock);
+ ifbdev->hpd_suspended = state == FBINFO_STATE_SUSPENDED;
+ send_hpd = !ifbdev->hpd_suspended && ifbdev->hpd_waiting;
+ ifbdev->hpd_waiting = false;
+ mutex_unlock(&ifbdev->hpd_lock);
+
+ if (send_hpd) {
+ DRM_DEBUG_KMS("Handling delayed fbcon HPD event\n");
+ drm_fb_helper_hotplug_event(&ifbdev->helper);
+ }
+}
+
void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous)
{
struct drm_i915_private *dev_priv = to_i915(dev);
@@ -775,6 +796,7 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous
*/
if (state != FBINFO_STATE_RUNNING)
flush_work(&dev_priv->fbdev_suspend_work);
+
console_lock();
} else {
/*
@@ -802,17 +824,26 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous
drm_fb_helper_set_suspend(&ifbdev->helper, state);
console_unlock();
+
+ intel_fbdev_hpd_set_suspend(ifbdev, state);
}
void intel_fbdev_output_poll_changed(struct drm_device *dev)
{
struct intel_fbdev *ifbdev = to_i915(dev)->fbdev;
+ bool send_hpd;
if (!ifbdev)
return;
intel_fbdev_sync(ifbdev);
- if (ifbdev->vma || ifbdev->helper.deferred_setup)
+
+ mutex_lock(&ifbdev->hpd_lock);
+ send_hpd = !ifbdev->hpd_suspended;
+ ifbdev->hpd_waiting = true;
+ mutex_unlock(&ifbdev->hpd_lock);
+
+ if (send_hpd && (ifbdev->vma || ifbdev->helper.deferred_setup))
drm_fb_helper_hotplug_event(&ifbdev->helper);
}
--
2.20.1
The patch titled
Subject: Revert "mm, memory_hotplug: initialize struct pages for the full memory section"
has been removed from the -mm tree. Its filename was
revert-mm-memory_hotplug-initialize-struct-pages-for-the-full-memory-section.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Michal Hocko <mhocko(a)suse.com>
Subject: Revert "mm, memory_hotplug: initialize struct pages for the full memory section"
This reverts 2830bf6f05fb3e05b ("mm, memory_hotplug: initialize struct
pages for the full memory section").
The underlying assumption that one sparse section belongs into a single
numa node doesn't hold really. Robert Shteynfeld has reported a boot
failure. The boot log was not captured but his memory layout is as
follows:
[ 0.286954] Early memory node ranges
[ 0.286955] node 1: [mem 0x0000000000001000-0x0000000000090fff]
[ 0.286955] node 1: [mem 0x0000000000100000-0x00000000dbdf8fff]
[ 0.286956] node 1: [mem 0x0000000100000000-0x0000001423ffffff]
[ 0.286956] node 0: [mem 0x0000001424000000-0x0000002023ffffff]
This means that node0 starts in the middle of a memory section which is
also in node1. memmap_init_zone tries to initialize padding of a section
even when it is outside of the given pfn range because there are code
paths (e.g. memory hotplug) which assume that the full worth of memory
section is always initialized. In this particular case, though, such a
range is already initialized and most likely already managed by the page
allocator. Scribbling over those pages corrupts the internal state and
likely blows up when any of those pages gets used.
Link: http://lkml.kernel.org/r/20190125181549.GE20411@dhcp22.suse.cz
Fixes: 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full memory section")
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
Reported-by: Robert Shteynfeld <robert.shteynfeld(a)gmail.com>
Cc: Mikhail Zaslonko <zaslonko(a)linux.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer(a)de.ibm.com>
Cc: Mikhail Gavrilov <mikhail.v.gavrilov(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Alexander Duyck <alexander.h.duyck(a)linux.intel.com>
Cc: Pasha Tatashin <Pavel.Tatashin(a)microsoft.com>
Cc: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
Cc: Heiko Carstens <heiko.carstens(a)de.ibm.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 12 ------------
1 file changed, 12 deletions(-)
--- a/mm/page_alloc.c~revert-mm-memory_hotplug-initialize-struct-pages-for-the-full-memory-section
+++ a/mm/page_alloc.c
@@ -5701,18 +5701,6 @@ void __meminit memmap_init_zone(unsigned
cond_resched();
}
}
-#ifdef CONFIG_SPARSEMEM
- /*
- * If the zone does not span the rest of the section then
- * we should at least initialize those pages. Otherwise we
- * could blow up on a poisoned page in some paths which depend
- * on full sections being initialized (e.g. memory hotplug).
- */
- while (end_pfn % PAGES_PER_SECTION) {
- __init_single_page(pfn_to_page(end_pfn), end_pfn, zone, nid);
- end_pfn++;
- }
-#endif
}
#ifdef CONFIG_ZONE_DEVICE
_
Patches currently in -mm which might be from mhocko(a)suse.com are
mm-memory_hotplug-is_mem_section_removable-do-not-pass-the-end-of-a-zone.patch
mm-oom-marks-all-killed-tasks-as-oom-victims.patch
memcg-do-not-report-racy-no-eligible-oom-tasks.patch