Hi Greg/Sasha:
Please apply this series to v4.9.y stable tree. The reason is that v4.9-stable
branch has this patch for stmmac ethernet driver:
"6f37f7b62baa net: stmmac: socfpga: add additional ocp reset line for
Stratix10"
This patch calls devm_reset_control_get_optional(). This call ultimately
fails without this reset patch, becuase the call to
devm_reset_control_get_optional() is returning an error code, which causes the
ethernet driver to fail to load. This patch fixes the call to *_get_optional_*
in that any call to the reset driver with *_optional_* will return 0 instead of
an error value.
Patches 2-6 are needed as well because those patches fix patch 1/6.
Thanks..
Dinh
Heiner Kallweit (1):
reset: core: fix reset_control_put
Masahiro Yamada (2):
reset: make device_reset_optional() really optional
reset: remove remaining WARN_ON() in <linux/reset.h>
Philipp Zabel (2):
reset: fix optional reset_control_get stubs to return NULL
reset: add exported __reset_control_get, return NULL if optional
Ramiro Oliveira (1):
reset: make optional functions really optional
drivers/reset/core.c | 79 +++++++++++++++++++++++++++----------
include/linux/reset.h | 92 +++++++++++++++++++++----------------------
2 files changed, 104 insertions(+), 67 deletions(-)
--
2.17.1
This is a note to let you know that I've just added the patch titled
kgdboc: fix KASAN global-out-of-bounds bug in param_set_kgdboc_var()
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From dada6a43b0402eba438a17ac86fdc64ac56a4607 Mon Sep 17 00:00:00 2001
From: Macpaul Lin <macpaul(a)gmail.com>
Date: Wed, 17 Oct 2018 23:08:38 +0800
Subject: kgdboc: fix KASAN global-out-of-bounds bug in param_set_kgdboc_var()
This patch is trying to fix KE issue due to
"BUG: KASAN: global-out-of-bounds in param_set_kgdboc_var+0x194/0x198"
reported by Syzkaller scan."
[26364:syz-executor0][name:report8t]BUG: KASAN: global-out-of-bounds in param_set_kgdboc_var+0x194/0x198
[26364:syz-executor0][name:report&]Read of size 1 at addr ffffff900e44f95f by task syz-executor0/26364
[26364:syz-executor0][name:report&]
[26364:syz-executor0]CPU: 7 PID: 26364 Comm: syz-executor0 Tainted: G W 0
[26364:syz-executor0]Call trace:
[26364:syz-executor0][<ffffff9008095cf8>] dump_bacIctrace+Ox0/0x470
[26364:syz-executor0][<ffffff9008096de0>] show_stack+0x20/0x30
[26364:syz-executor0][<ffffff90089cc9c8>] dump_stack+Oxd8/0x128
[26364:syz-executor0][<ffffff90084edb38>] print_address_description +0x80/0x4a8
[26364:syz-executor0][<ffffff90084ee270>] kasan_report+Ox178/0x390
[26364:syz-executor0][<ffffff90084ee4a0>] _asan_report_loadi_noabort+Ox18/0x20
[26364:syz-executor0][<ffffff9008b092ac>] param_set_kgdboc_var+Ox194/0x198
[26364:syz-executor0][<ffffff900813af64>] param_attr_store+Ox14c/0x270
[26364:syz-executor0][<ffffff90081394c8>] module_attr_store+0x60/0x90
[26364:syz-executor0][<ffffff90086690c0>] sysfs_kl_write+Ox100/0x158
[26364:syz-executor0][<ffffff9008666d84>] kernfs_fop_write+0x27c/0x3a8
[26364:syz-executor0][<ffffff9008508264>] do_loop_readv_writev+0x114/0x1b0
[26364:syz-executor0][<ffffff9008509ac8>] do_readv_writev+0x4f8/0x5e0
[26364:syz-executor0][<ffffff9008509ce4>] vfs_writev+0x7c/Oxb8
[26364:syz-executor0][<ffffff900850ba64>] SyS_writev+Oxcc/0x208
[26364:syz-executor0][<ffffff90080883f0>] elO_svc_naked +0x24/0x28
[26364:syz-executor0][name:report&]
[26364:syz-executor0][name:report&]The buggy address belongs to the variable:
[26364:syz-executor0][name:report&] kgdb_tty_line+Ox3f/0x40
[26364:syz-executor0][name:report&]
[26364:syz-executor0][name:report&]Memory state around the buggy address:
[26364:syz-executor0] ffffff900e44f800: 00 00 00 00 00 04 fa fa fa fa fa fa 00 fa fa fa
[26364:syz-executor0] ffffff900e44f880: fa fa fa fa 00 fa fa fa fa fa fa fa 00 fa fa fa
[26364:syz-executor0]> ffffff900e44f900: fa fa fa fa 04 fa fa fa fa fa fa fa 00 00 00 00
[26364:syz-executor0][name:report&] ^
[26364:syz-executor0] ffffff900e44f980: 00 fa fa fa fa fa fa fa 04 fa fa fa fa fa fa fa
[26364:syz-executor0] ffffff900e44fa00: 04 fa fa fa fa fa fa fa 00 fa fa fa fa fa fa fa
[26364:syz-executor0][name:report&]
[26364:syz-executor0][name:panic&]Disabling lock debugging due to kernel taint
[26364:syz-executor0]------------[cut here]------------
After checking the source code, we've found there might be an out-of-bounds
access to "config[len - 1]" array when the variable "len" is zero.
Signed-off-by: Macpaul Lin <macpaul(a)gmail.com>
Acked-by: Daniel Thompson <daniel.thompson(a)linaro.org>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/kgdboc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
index baeeeaec3f03..6fb312e7af71 100644
--- a/drivers/tty/serial/kgdboc.c
+++ b/drivers/tty/serial/kgdboc.c
@@ -233,7 +233,7 @@ static void kgdboc_put_char(u8 chr)
static int param_set_kgdboc_var(const char *kmessage,
const struct kernel_param *kp)
{
- int len = strlen(kmessage);
+ size_t len = strlen(kmessage);
if (len >= MAX_CONFIG_LEN) {
pr_err("config string too long\n");
@@ -254,7 +254,7 @@ static int param_set_kgdboc_var(const char *kmessage,
strcpy(config, kmessage);
/* Chop out \n char as a result of echo */
- if (config[len - 1] == '\n')
+ if (len && config[len - 1] == '\n')
config[len - 1] = '\0';
if (configured == 1)
--
2.19.2
The driver defines three states for a cppi channel.
- idle: .chan_busy == 0 && not in .pending list
- pending: .chan_busy == 0 && in .pending list
- busy: .chan_busy == 1 && not in .pending list
There are cases in which the cppi channel could be in the pending state
when cppi41_dma_issue_pending() is called after cppi41_runtime_suspend()
is called.
cppi41_stop_chan() has a bug for these cases to set channels to idle state.
It only checks the .chan_busy flag, but not the .pending list, then later
when cppi41_runtime_resume() is called the channels in .pending list will
be transitioned to busy state.
Removing channels from the .pending list solves the problem.
Fixes: 975faaeb9985 ("dma: cppi41: start tear down only if channel is busy")
Cc: stable(a)vger.kernel.org # v3.15+
Signed-off-by: Bin Liu <b-liu(a)ti.com>
---
drivers/dma/ti/cppi41.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/dma/ti/cppi41.c b/drivers/dma/ti/cppi41.c
index 1497da367710..e507ec36c0d3 100644
--- a/drivers/dma/ti/cppi41.c
+++ b/drivers/dma/ti/cppi41.c
@@ -723,8 +723,22 @@ static int cppi41_stop_chan(struct dma_chan *chan)
desc_phys = lower_32_bits(c->desc_phys);
desc_num = (desc_phys - cdd->descs_phys) / sizeof(struct cppi41_desc);
- if (!cdd->chan_busy[desc_num])
+ if (!cdd->chan_busy[desc_num]) {
+ struct cppi41_channel *cc, *_ct;
+
+ /*
+ * channels might still be in the pendling list if
+ * cppi41_dma_issue_pending() is called after
+ * cppi41_runtime_suspend() is called
+ */
+ list_for_each_entry_safe(cc, _ct, &cdd->pending, node) {
+ if (cc != c)
+ continue;
+ list_del(&cc->node);
+ break;
+ }
return 0;
+ }
ret = cppi41_tear_down_chan(c);
if (ret)
--
2.17.1
I've backported a number of fixes for security issues affecting 4.9-
stable. All of these are already fixed in 4.14-stable and 4.19-stable.
Most of the issues involve filesystem validation, and I tested with the
reproducers where available.
For the BPF fix, I verified that the self-tests (taken from 4.14)
didn't regress and temporarily added logging to check that the
mitigation is applied when needed.
Ben.
--
Ben Hutchings, Software Developer Codethink Ltd
https://www.codethink.co.uk/ Dale House, 35 Dale Street
Manchester, M1 2HF, United Kingdom
From: Michal Hocko <mhocko(a)suse.com>
We have received a bug report that an injected MCE about faulty memory
prevents memory offline to succeed. The underlying reason is that the
HWPoison page has an elevated reference count and the migration keeps
failing. There are two problems with that. First of all it is dubious
to migrate the poisoned page because we know that accessing that memory
is possible to fail. Secondly it doesn't make any sense to migrate a
potentially broken content and preserve the memory corruption over to a
new location.
Oscar has found out that it is the elevated reference count from
memory_failure that is confusing the offlining path. HWPoisoned pages
are isolated from the LRU list but __offline_pages might still try to
migrate them if there is any preceding migrateable pages in the pfn
range. Such a migration would fail due to the reference count but
the migration code would put it back on the LRU list. This is quite
wrong in itself but it would also make scan_movable_pages stumble over
it again without any way out.
This means that the hotremove with hwpoisoned pages has never really
worked (without a luck). HWPoisoning really needs a larger surgery
but an immediate and backportable fix is to skip over these pages during
offlining. Even if they are still mapped for some reason then
try_to_unmap should turn those mappings into hwpoison ptes and cause
SIGBUS on access. Nobody should be really touching the content of the
page so it should be safe to ignore them even when there is a pending
reference count.
Debugged-by: Oscar Salvador <osalvador(a)suse.com>
Cc: stable
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
---
Hi,
I am sending this as an RFC now because I am not fully sure I see all
the consequences myself yet. This has passed a testing by Oscar but I
would highly appreciate a review from Naoya about my assumptions about
hwpoisoning. E.g. it is not entirely clear to me whether there is a
potential case where the page might be still mapped. I have put
try_to_unmap just to be sure. It would be really great if I could drop
that part because then it is not really great which of the TTU flags to
use to cover all potential cases.
I have marked the patch for stable but I have no idea how far back it
should go. Probably everything that already has hotremove and hwpoison
code.
Thanks in advance!
mm/memory_hotplug.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c6c42a7425e5..08c576d5a633 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -34,6 +34,7 @@
#include <linux/hugetlb.h>
#include <linux/memblock.h>
#include <linux/compaction.h>
+#include <linux/rmap.h>
#include <asm/tlbflush.h>
@@ -1366,6 +1367,17 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
pfn = page_to_pfn(compound_head(page))
+ hpage_nr_pages(page) - 1;
+ /*
+ * HWPoison pages have elevated reference counts so the migration would
+ * fail on them. It also doesn't make any sense to migrate them in the
+ * first place. Still try to unmap such a page in case it is still mapped.
+ */
+ if (PageHWPoison(page)) {
+ if (page_mapped(page))
+ try_to_unmap(page, TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS);
+ continue;
+ }
+
if (!get_page_unless_zero(page))
continue;
/*
--
2.19.1
From: Michal Hocko <mhocko(a)suse.com>
We have received a bug report that an injected MCE about faulty memory
prevents memory offline to succeed on 4.4 base kernel. The underlying
reason was that the HWPoison page has an elevated reference count and
the migration keeps failing. There are two problems with that. First
of all it is dubious to migrate the poisoned page because we know that
accessing that memory is possible to fail. Secondly it doesn't make any
sense to migrate a potentially broken content and preserve the memory
corruption over to a new location.
Oscar has found out that 4.4 and the current upstream kernels behave
slightly differently with his simply testcase
===
int main(void)
{
int ret;
int i;
int fd;
char *array = malloc(4096);
char *array_locked = malloc(4096);
fd = open("/tmp/data", O_RDONLY);
read(fd, array, 4095);
for (i = 0; i < 4096; i++)
array_locked[i] = 'd';
ret = mlock((void *)PAGE_ALIGN((unsigned long)array_locked), sizeof(array_locked));
if (ret)
perror("mlock");
sleep (20);
ret = madvise((void *)PAGE_ALIGN((unsigned long)array_locked), 4096, MADV_HWPOISON);
if (ret)
perror("madvise");
for (i = 0; i < 4096; i++)
array_locked[i] = 'd';
return 0;
}
===
+ offline this memory.
In 4.4 kernels he saw the hwpoisoned page to be returned back to the LRU
list
kernel: [<ffffffff81019ac9>] dump_trace+0x59/0x340
kernel: [<ffffffff81019e9a>] show_stack_log_lvl+0xea/0x170
kernel: [<ffffffff8101ac71>] show_stack+0x21/0x40
kernel: [<ffffffff8132bb90>] dump_stack+0x5c/0x7c
kernel: [<ffffffff810815a1>] warn_slowpath_common+0x81/0xb0
kernel: [<ffffffff811a275c>] __pagevec_lru_add_fn+0x14c/0x160
kernel: [<ffffffff811a2eed>] pagevec_lru_move_fn+0xad/0x100
kernel: [<ffffffff811a334c>] __lru_cache_add+0x6c/0xb0
kernel: [<ffffffff81195236>] add_to_page_cache_lru+0x46/0x70
kernel: [<ffffffffa02b4373>] extent_readpages+0xc3/0x1a0 [btrfs]
kernel: [<ffffffff811a16d7>] __do_page_cache_readahead+0x177/0x200
kernel: [<ffffffff811a18c8>] ondemand_readahead+0x168/0x2a0
kernel: [<ffffffff8119673f>] generic_file_read_iter+0x41f/0x660
kernel: [<ffffffff8120e50d>] __vfs_read+0xcd/0x140
kernel: [<ffffffff8120e9ea>] vfs_read+0x7a/0x120
kernel: [<ffffffff8121404b>] kernel_read+0x3b/0x50
kernel: [<ffffffff81215c80>] do_execveat_common.isra.29+0x490/0x6f0
kernel: [<ffffffff81215f08>] do_execve+0x28/0x30
kernel: [<ffffffff81095ddb>] call_usermodehelper_exec_async+0xfb/0x130
kernel: [<ffffffff8161c045>] ret_from_fork+0x55/0x80
And that later confuses the hotremove path because an LRU page is
attempted to be migrated and that fails due to an elevated reference
count. It is quite possible that the reuse of the HWPoisoned page is
some kind of fixed race condition but I am not really sure about that.
With the upstream kernel the failure is slightly different. The page
doesn't seem to have LRU bit set but isolate_movable_page simply fails
and do_migrate_range simply puts all the isolated pages back to LRU and
therefore no progress is made and scan_movable_pages finds same set of
pages over and over again.
Fix both cases by explicitly checking HWPoisoned pages before we even
try to get a reference on the page, try to unmap it if it is still
mapped. As explained by Naoya
: Hwpoison code never unmapped those for no big reason because
: Ksm pages never dominate memory, so we simply didn't have strong
: motivation to save the pages.
Also put WARN_ON(PageLRU) in case there is a race and we can hit LRU
HWPoison pages which shouldn't happen but I couldn't convince myself
about that. Naoya has noted the following
: Theoretically no such gurantee, because try_to_unmap() doesn't have a
: guarantee of success and then memory_failure() returns immediately
: when hwpoison_user_mappings fails.
: Or the following code (comes after hwpoison_user_mappings block) also impli=
: es
: that the target page can still have PageLRU flag.
:
: /*
: * Torn down by someone else?
: */
: if (PageLRU(p) && !PageSwapCache(p) && p->mapping =3D=3D NULL) {
: action_result(pfn, MF_MSG_TRUNCATED_LRU, MF_IGNORED);
: res =3D -EBUSY;
: goto out;
: }
:
: So I think it's OK to keep "if (WARN_ON(PageLRU(page)))" block in
: current version of your patch.
Debugged-by: Oscar Salvador <osalvador(a)suse.com>
Cc: stable
Reviewed-by: Oscar Salvador <osalvador(a)suse.com>
Tested-by: Oscar Salvador <osalvador(a)suse.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
---
Hi Andrew,
this has been posted as an RFC [1] previously. It took 2 versions to get
the patch right but it seems that this one should work reasonably well.
I guess we want to have it in linux-next for some time but I do not
expect many people do test MCEs + hotremove considering the breakage is
old and nobody has noticed so far.
[1] http://lkml.kernel.org/r/20181203100309.14784-1-mhocko@kernel.org
mm/memory_hotplug.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c6c42a7425e5..cfa1a2736876 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -34,6 +34,7 @@
#include <linux/hugetlb.h>
#include <linux/memblock.h>
#include <linux/compaction.h>
+#include <linux/rmap.h>
#include <asm/tlbflush.h>
@@ -1366,6 +1367,21 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
pfn = page_to_pfn(compound_head(page))
+ hpage_nr_pages(page) - 1;
+ /*
+ * HWPoison pages have elevated reference counts so the migration would
+ * fail on them. It also doesn't make any sense to migrate them in the
+ * first place. Still try to unmap such a page in case it is still mapped
+ * (e.g. current hwpoison implementation doesn't unmap KSM pages but keep
+ * the unmap as the catch all safety net).
+ */
+ if (PageHWPoison(page)) {
+ if (WARN_ON(PageLRU(page)))
+ isolate_lru_page(page);
+ if (page_mapped(page))
+ try_to_unmap(page, TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS);
+ continue;
+ }
+
if (!get_page_unless_zero(page))
continue;
/*
--
2.19.2
This is a note to let you know that I've just added the patch titled
xhci: workaround CSS timeout on AMD SNPS 3.0 xHC
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From a7d57abcc8a5bdeb53bbf8e87558e8e0a2c2a29d Mon Sep 17 00:00:00 2001
From: Sandeep Singh <sandeep.singh(a)amd.com>
Date: Wed, 5 Dec 2018 14:22:38 +0200
Subject: xhci: workaround CSS timeout on AMD SNPS 3.0 xHC
Occasionally AMD SNPS 3.0 xHC does not respond to
CSS when set, also it does not flag anything on SRE and HCE
to point the internal xHC errors on USBSTS register. This stalls
the entire system wide suspend and there is no point in stalling
just because of xHC CSS is not responding.
To work around this problem, if the xHC does not flag
anything on SRE and HCE, we can skip the CSS
timeout and allow the system to continue the suspend. Once the
system resume happens we can internally reset the controller
using XHCI_RESET_ON_RESUME quirk
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k(a)amd.com>
Signed-off-by: Sandeep Singh <Sandeep.Singh(a)amd.com>
cc: Nehal Shah <Nehal-bakulchandra.Shah(a)amd.com>
Cc: <stable(a)vger.kernel.org>
Tested-by: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/host/xhci-pci.c | 4 ++++
drivers/usb/host/xhci.c | 26 ++++++++++++++++++++++----
drivers/usb/host/xhci.h | 3 +++
3 files changed, 29 insertions(+), 4 deletions(-)
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index a9515265db4d..a9ec7051f286 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -139,6 +139,10 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
pdev->device == 0x43bb))
xhci->quirks |= XHCI_SUSPEND_DELAY;
+ if (pdev->vendor == PCI_VENDOR_ID_AMD &&
+ (pdev->device == 0x15e0 || pdev->device == 0x15e1))
+ xhci->quirks |= XHCI_SNPS_BROKEN_SUSPEND;
+
if (pdev->vendor == PCI_VENDOR_ID_AMD)
xhci->quirks |= XHCI_TRUST_TX_LENGTH;
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index c928dbbff881..c20b85e28d81 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -968,6 +968,7 @@ int xhci_suspend(struct xhci_hcd *xhci, bool do_wakeup)
unsigned int delay = XHCI_MAX_HALT_USEC;
struct usb_hcd *hcd = xhci_to_hcd(xhci);
u32 command;
+ u32 res;
if (!hcd->state)
return 0;
@@ -1021,11 +1022,28 @@ int xhci_suspend(struct xhci_hcd *xhci, bool do_wakeup)
command = readl(&xhci->op_regs->command);
command |= CMD_CSS;
writel(command, &xhci->op_regs->command);
+ xhci->broken_suspend = 0;
if (xhci_handshake(&xhci->op_regs->status,
STS_SAVE, 0, 10 * 1000)) {
- xhci_warn(xhci, "WARN: xHC save state timeout\n");
- spin_unlock_irq(&xhci->lock);
- return -ETIMEDOUT;
+ /*
+ * AMD SNPS xHC 3.0 occasionally does not clear the
+ * SSS bit of USBSTS and when driver tries to poll
+ * to see if the xHC clears BIT(8) which never happens
+ * and driver assumes that controller is not responding
+ * and times out. To workaround this, its good to check
+ * if SRE and HCE bits are not set (as per xhci
+ * Section 5.4.2) and bypass the timeout.
+ */
+ res = readl(&xhci->op_regs->status);
+ if ((xhci->quirks & XHCI_SNPS_BROKEN_SUSPEND) &&
+ (((res & STS_SRE) == 0) &&
+ ((res & STS_HCE) == 0))) {
+ xhci->broken_suspend = 1;
+ } else {
+ xhci_warn(xhci, "WARN: xHC save state timeout\n");
+ spin_unlock_irq(&xhci->lock);
+ return -ETIMEDOUT;
+ }
}
spin_unlock_irq(&xhci->lock);
@@ -1078,7 +1096,7 @@ int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
set_bit(HCD_FLAG_HW_ACCESSIBLE, &xhci->shared_hcd->flags);
spin_lock_irq(&xhci->lock);
- if (xhci->quirks & XHCI_RESET_ON_RESUME)
+ if ((xhci->quirks & XHCI_RESET_ON_RESUME) || xhci->broken_suspend)
hibernated = true;
if (!hibernated) {
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 260b259b72bc..c3515bad5dbb 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1850,6 +1850,7 @@ struct xhci_hcd {
#define XHCI_ZERO_64B_REGS BIT_ULL(32)
#define XHCI_DEFAULT_PM_RUNTIME_ALLOW BIT_ULL(33)
#define XHCI_RESET_PLL_ON_DISCONNECT BIT_ULL(34)
+#define XHCI_SNPS_BROKEN_SUSPEND BIT_ULL(35)
unsigned int num_active_eps;
unsigned int limit_active_eps;
@@ -1879,6 +1880,8 @@ struct xhci_hcd {
void *dbc;
/* platform-specific data -- must come last */
unsigned long priv[0] __aligned(sizeof(s64));
+ /* Broken Suspend flag for SNPS Suspend resume issue */
+ u8 broken_suspend;
};
/* Platform specific overrides to generic XHCI hc_driver ops */
--
2.19.2
This is a note to let you know that I've just added the patch titled
xhci: Prevent U1/U2 link pm states if exit latency is too long
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 0472bf06c6fd33c1a18aaead4c8f91e5a03d8d7b Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Wed, 5 Dec 2018 14:22:39 +0200
Subject: xhci: Prevent U1/U2 link pm states if exit latency is too long
Don't allow USB3 U1 or U2 if the latency to wake up from the U-state
reaches the service interval for a periodic endpoint.
This is according to xhci 1.1 specification section 4.23.5.2 extra note:
"Software shall ensure that a device is prevented from entering a U-state
where its worst case exit latency approaches the ESIT."
Allowing too long exit latencies for periodic endpoint confuses xHC
internal scheduling, and new devices may fail to enumerate with a
"Not enough bandwidth for new device state" error from the host.
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/host/xhci.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index c20b85e28d81..dae3be1b9c8f 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -4514,6 +4514,14 @@ static u16 xhci_calculate_u1_timeout(struct xhci_hcd *xhci,
{
unsigned long long timeout_ns;
+ /* Prevent U1 if service interval is shorter than U1 exit latency */
+ if (usb_endpoint_xfer_int(desc) || usb_endpoint_xfer_isoc(desc)) {
+ if (xhci_service_interval_to_ns(desc) <= udev->u1_params.mel) {
+ dev_dbg(&udev->dev, "Disable U1, ESIT shorter than exit latency\n");
+ return USB3_LPM_DISABLED;
+ }
+ }
+
if (xhci->quirks & XHCI_INTEL_HOST)
timeout_ns = xhci_calculate_intel_u1_timeout(udev, desc);
else
@@ -4570,6 +4578,14 @@ static u16 xhci_calculate_u2_timeout(struct xhci_hcd *xhci,
{
unsigned long long timeout_ns;
+ /* Prevent U2 if service interval is shorter than U2 exit latency */
+ if (usb_endpoint_xfer_int(desc) || usb_endpoint_xfer_isoc(desc)) {
+ if (xhci_service_interval_to_ns(desc) <= udev->u2_params.mel) {
+ dev_dbg(&udev->dev, "Disable U2, ESIT shorter than exit latency\n");
+ return USB3_LPM_DISABLED;
+ }
+ }
+
if (xhci->quirks & XHCI_INTEL_HOST)
timeout_ns = xhci_calculate_intel_u2_timeout(udev, desc);
else
--
2.19.2
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 104f708fd1241b22f808bdf066ab67dc5a051de5 Mon Sep 17 00:00:00 2001
From: Harald Freudenberger <freude(a)linux.ibm.com>
Date: Fri, 9 Nov 2018 14:59:24 +0100
Subject: [PATCH] s390/zcrypt: reinit ap queue state machine during device
probe
Until the vfio-ap driver came into live there was a well known
agreement about the way how ap devices are initialized and their
states when the driver's probe function is called.
However, the vfio device driver when receiving an ap queue device does
additional resets thereby removing the registration for interrupts for
the ap device done by the ap bus core code. So when later the vfio
driver releases the device and one of the default zcrypt drivers takes
care of the device the interrupt registration needs to get
renewed. The current code does no renew and result is that requests
send into such a queue will never see a reply processed - the
application hangs.
This patch adds a function which resets the aq queue state machine for
the ap queue device and triggers the walk through the initial states
(which are reset and registration for interrupts). This function is
now called before the driver's probe function is invoked.
When the association between driver and device is released, the
driver's remove function is called. The current implementation calls a
ap queue function ap_queue_remove(). This invokation has been moved to
the ap bus function to make the probe / remove pair for ap bus and
drivers more symmetric.
Fixes: 7e0bdbe5c21c ("s390/zcrypt: AP bus support for alternate driver(s)")
Cc: stable(a)vger.kernel.org # 4.19+
Signed-off-by: Harald Freudenberger <freude(a)linux.ibm.com>
Reviewd-by: Tony Krowiak <akrowiak(a)linux.ibm.com>
Reviewd-by: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
index 048665e4f13d..9f5a201c4c87 100644
--- a/drivers/s390/crypto/ap_bus.c
+++ b/drivers/s390/crypto/ap_bus.c
@@ -775,6 +775,8 @@ static int ap_device_probe(struct device *dev)
drvres = ap_drv->flags & AP_DRIVER_FLAG_DEFAULT;
if (!!devres != !!drvres)
return -ENODEV;
+ /* (re-)init queue's state machine */
+ ap_queue_reinit_state(to_ap_queue(dev));
}
/* Add queue/card to list of active queues/cards */
@@ -807,6 +809,8 @@ static int ap_device_remove(struct device *dev)
struct ap_device *ap_dev = to_ap_dev(dev);
struct ap_driver *ap_drv = ap_dev->drv;
+ if (is_queue_dev(dev))
+ ap_queue_remove(to_ap_queue(dev));
if (ap_drv->remove)
ap_drv->remove(ap_dev);
@@ -1444,10 +1448,6 @@ static void ap_scan_bus(struct work_struct *unused)
aq->ap_dev.device.parent = &ac->ap_dev.device;
dev_set_name(&aq->ap_dev.device,
"%02x.%04x", id, dom);
- /* Start with a device reset */
- spin_lock_bh(&aq->lock);
- ap_wait(ap_sm_event(aq, AP_EVENT_POLL));
- spin_unlock_bh(&aq->lock);
/* Register device */
rc = device_register(&aq->ap_dev.device);
if (rc) {
diff --git a/drivers/s390/crypto/ap_bus.h b/drivers/s390/crypto/ap_bus.h
index 3eed1b36c876..bfc66e4a9de1 100644
--- a/drivers/s390/crypto/ap_bus.h
+++ b/drivers/s390/crypto/ap_bus.h
@@ -254,6 +254,7 @@ struct ap_queue *ap_queue_create(ap_qid_t qid, int device_type);
void ap_queue_remove(struct ap_queue *aq);
void ap_queue_suspend(struct ap_device *ap_dev);
void ap_queue_resume(struct ap_device *ap_dev);
+void ap_queue_reinit_state(struct ap_queue *aq);
struct ap_card *ap_card_create(int id, int queue_depth, int raw_device_type,
int comp_device_type, unsigned int functions);
diff --git a/drivers/s390/crypto/ap_queue.c b/drivers/s390/crypto/ap_queue.c
index 66f7334bcb03..0aa4b3ccc948 100644
--- a/drivers/s390/crypto/ap_queue.c
+++ b/drivers/s390/crypto/ap_queue.c
@@ -718,5 +718,20 @@ void ap_queue_remove(struct ap_queue *aq)
{
ap_flush_queue(aq);
del_timer_sync(&aq->timeout);
+
+ /* reset with zero, also clears irq registration */
+ spin_lock_bh(&aq->lock);
+ ap_zapq(aq->qid);
+ aq->state = AP_STATE_BORKED;
+ spin_unlock_bh(&aq->lock);
}
EXPORT_SYMBOL(ap_queue_remove);
+
+void ap_queue_reinit_state(struct ap_queue *aq)
+{
+ spin_lock_bh(&aq->lock);
+ aq->state = AP_STATE_RESET_START;
+ ap_wait(ap_sm_event(aq, AP_EVENT_POLL));
+ spin_unlock_bh(&aq->lock);
+}
+EXPORT_SYMBOL(ap_queue_reinit_state);
diff --git a/drivers/s390/crypto/zcrypt_cex2a.c b/drivers/s390/crypto/zcrypt_cex2a.c
index 146f54f5cbb8..c50f3e86cc74 100644
--- a/drivers/s390/crypto/zcrypt_cex2a.c
+++ b/drivers/s390/crypto/zcrypt_cex2a.c
@@ -196,7 +196,6 @@ static void zcrypt_cex2a_queue_remove(struct ap_device *ap_dev)
struct ap_queue *aq = to_ap_queue(&ap_dev->device);
struct zcrypt_queue *zq = aq->private;
- ap_queue_remove(aq);
if (zq)
zcrypt_queue_unregister(zq);
}
diff --git a/drivers/s390/crypto/zcrypt_cex2c.c b/drivers/s390/crypto/zcrypt_cex2c.c
index 546f67676734..35c7c6672713 100644
--- a/drivers/s390/crypto/zcrypt_cex2c.c
+++ b/drivers/s390/crypto/zcrypt_cex2c.c
@@ -251,7 +251,6 @@ static void zcrypt_cex2c_queue_remove(struct ap_device *ap_dev)
struct ap_queue *aq = to_ap_queue(&ap_dev->device);
struct zcrypt_queue *zq = aq->private;
- ap_queue_remove(aq);
if (zq)
zcrypt_queue_unregister(zq);
}
diff --git a/drivers/s390/crypto/zcrypt_cex4.c b/drivers/s390/crypto/zcrypt_cex4.c
index f9d4c6c7521d..582ffa7e0f18 100644
--- a/drivers/s390/crypto/zcrypt_cex4.c
+++ b/drivers/s390/crypto/zcrypt_cex4.c
@@ -275,7 +275,6 @@ static void zcrypt_cex4_queue_remove(struct ap_device *ap_dev)
struct ap_queue *aq = to_ap_queue(&ap_dev->device);
struct zcrypt_queue *zq = aq->private;
- ap_queue_remove(aq);
if (zq)
zcrypt_queue_unregister(zq);
}
Yongqin reported that /proc/zoneinfo format is broken in 4.14
due to commit 7aaf77272358 ("mm: don't show nr_indirectly_reclaimable
in /proc/vmstat")
Node 0, zone DMA
per-node stats
nr_inactive_anon 403
nr_active_anon 89123
nr_inactive_file 128887
nr_active_file 47377
nr_unevictable 2053
nr_slab_reclaimable 7510
nr_slab_unreclaimable 10775
nr_isolated_anon 0
nr_isolated_file 0
<...>
nr_vmscan_write 0
nr_vmscan_immediate_reclaim 0
nr_dirtied 6022
nr_written 5985
74240
^^^^^^^^^^
pages free 131656
The problem is caused by the nr_indirectly_reclaimable counter,
which is hidden from the /proc/vmstat, but not from the
/proc/zoneinfo. Let's fix this inconsistency and hide the
counter from /proc/zoneinfo exactly as from /proc/vmstat.
BTW, in 4.19+ the counter has been renamed and exported by
the commit b29940c1abd7 ("mm: rename and change semantics of
nr_indirectly_reclaimable_bytes"), so there is no such a problem
anymore.
Cc: <stable(a)vger.kernel.org> # 4.14.x-4.18.x
Fixes: 7aaf77272358 ("mm: don't show nr_indirectly_reclaimable in /proc/vmstat")
Reported-by: Yongqin Liu <yongqin.liu(a)linaro.org>
Signed-off-by: Roman Gushchin <guro(a)fb.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmstat.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 527ae727d547..6389e876c7a7 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1500,6 +1500,10 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
if (is_zone_first_populated(pgdat, zone)) {
seq_printf(m, "\n per-node stats");
for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
+ /* Skip hidden vmstat items. */
+ if (*vmstat_text[i + NR_VM_ZONE_STAT_ITEMS +
+ NR_VM_NUMA_STAT_ITEMS] == '\0')
+ continue;
seq_printf(m, "\n %-12s %lu",
vmstat_text[i + NR_VM_ZONE_STAT_ITEMS +
NR_VM_NUMA_STAT_ITEMS],
--
2.17.2
Hello Greg,
Could you please consider backporting to v4.4 the following commit:
commit b5b5b6fe643391209b08528bef410e0cf299b826
Author: Simon Guo <wei.guo.simon(a)gmail.com>
Date: Fri Oct 7 20:59:40 2016
mm: mlock: avoid increase mm->locked_vm on mlock() when already
mlock2(,MLOCK_ONFAULT)
It seems to be a trivial fix for:
https://bugs.linaro.org/show_bug.cgi?id=4043
(mlock203.c LTP test failing on v4.4)
Thanks in advance,
--
Rafael D. Tinoco
Linaro Kernel Validation Team
Backported mainline commit 6ff38bd40230af35
("mm: cleancache: fix corruption on missed inode invalidation")
From: Pavel Tikhomirov <ptikhomirov(a)virtuozzo.com>
Date: Fri, 30 Nov 2018 14:09:00 -0800
Subject: [PATCH] mm: cleancache: fix corruption on missed inode invalidation
If all pages are deleted from the mapping by memory reclaim and also
moved to the cleancache:
__delete_from_page_cache
(no shadow case)
unaccount_page_cache_page
cleancache_put_page
page_cache_delete
mapping->nrpages -= nr
(nrpages becomes 0)
We don't clean the cleancache for an inode after final file truncation
(removal).
truncate_inode_pages_final
check (nrpages || nrexceptional) is false
no truncate_inode_pages
no cleancache_invalidate_inode(mapping)
These way when reading the new file created with same inode we may get
these trash leftover pages from cleancache and see wrong data instead of
the contents of the new file.
Fix it by always doing truncate_inode_pages which is already ready for
nrpages == 0 && nrexceptional == 0 case and just invalidates inode.
[akpm(a)linux-foundation.org: add comment, per Jan]
Link: http://lkml.kernel.org/r/20181112095734.17979-1-ptikhomirov@virtuozzo.com
Fixes: commit 91b0abe36a7b ("mm + fs: store shadow entries in page cache")
Signed-off-by: Pavel Tikhomirov <ptikhomirov(a)virtuozzo.com>
Reviewed-by: Vasily Averin <vvs(a)virtuozzo.com>
Reviewed-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Vasily Averin <vvs(a)virtuozzo.com>
---
mm/truncate.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/mm/truncate.c b/mm/truncate.c
index 2330223841fb..d3a2737cc188 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -471,9 +471,13 @@ void truncate_inode_pages_final(struct address_space *mapping)
*/
spin_lock_irq(&mapping->tree_lock);
spin_unlock_irq(&mapping->tree_lock);
-
- truncate_inode_pages(mapping, 0);
}
+
+ /*
+ * Cleancache needs notification even if there are no pages or shadow
+ * entries.
+ */
+ truncate_inode_pages(mapping, 0);
}
EXPORT_SYMBOL(truncate_inode_pages_final);
--
2.17.1
Hi Greg,
This has been applied to 4.19-stable, but should be needed in
4.14-stable also. I have attached the backported patch and have
also tested with one such DVD where the problem was seen.
--
Regards
Sudip
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f8397d69daef06d358430d3054662fb597e37c00 Mon Sep 17 00:00:00 2001
From: Nikolay Borisov <nborisov(a)suse.com>
Date: Tue, 6 Nov 2018 16:40:20 +0200
Subject: [PATCH] btrfs: Always try all copies when reading extent buffers
When a metadata read is served the endio routine btree_readpage_end_io_hook
is called which eventually runs the tree-checker. If tree-checker fails
to validate the read eb then it sets EXTENT_BUFFER_CORRUPT flag. This
leads to btree_read_extent_buffer_pages wrongly assuming that all
available copies of this extent buffer are wrong and failing prematurely.
Fix this modify btree_read_extent_buffer_pages to read all copies of
the data.
This failure was exhibitted in xfstests btrfs/124 which would
spuriously fail its balance operations. The reason was that when balance
was run following re-introduction of the missing raid1 disk
__btrfs_map_block would map the read request to stripe 0, which
corresponded to devid 2 (the disk which is being removed in the test):
item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 3553624064) itemoff 15975 itemsize 112
length 1073741824 owner 2 stripe_len 65536 type DATA|RAID1
io_align 65536 io_width 65536 sector_size 4096
num_stripes 2 sub_stripes 1
stripe 0 devid 2 offset 2156920832
dev_uuid 8466c350-ed0c-4c3b-b17d-6379b445d5c8
stripe 1 devid 1 offset 3553624064
dev_uuid 1265d8db-5596-477e-af03-df08eb38d2ca
This caused read requests for a checksum item that to be routed to the
stale disk which triggered the aforementioned logic involving
EXTENT_BUFFER_CORRUPT flag. This then triggered cascading failures of
the balance operation.
Fixes: a826d6dcb32d ("Btrfs: check items for correctness as we search")
CC: stable(a)vger.kernel.org # 4.4+
Suggested-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: Nikolay Borisov <nborisov(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 3f0b6d1936e8..6d776717d8b3 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -477,9 +477,9 @@ static int btree_read_extent_buffer_pages(struct btrfs_fs_info *fs_info,
int mirror_num = 0;
int failed_mirror = 0;
- clear_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags);
io_tree = &BTRFS_I(fs_info->btree_inode)->io_tree;
while (1) {
+ clear_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags);
ret = read_extent_buffer_pages(io_tree, eb, WAIT_COMPLETE,
mirror_num);
if (!ret) {
@@ -493,15 +493,6 @@ static int btree_read_extent_buffer_pages(struct btrfs_fs_info *fs_info,
break;
}
- /*
- * This buffer's crc is fine, but its contents are corrupted, so
- * there is no reason to read the other copies, they won't be
- * any less wrong.
- */
- if (test_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags) ||
- ret == -EUCLEAN)
- break;
-
num_copies = btrfs_num_copies(fs_info,
eb->start, eb->len);
if (num_copies == 1)
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f8397d69daef06d358430d3054662fb597e37c00 Mon Sep 17 00:00:00 2001
From: Nikolay Borisov <nborisov(a)suse.com>
Date: Tue, 6 Nov 2018 16:40:20 +0200
Subject: [PATCH] btrfs: Always try all copies when reading extent buffers
When a metadata read is served the endio routine btree_readpage_end_io_hook
is called which eventually runs the tree-checker. If tree-checker fails
to validate the read eb then it sets EXTENT_BUFFER_CORRUPT flag. This
leads to btree_read_extent_buffer_pages wrongly assuming that all
available copies of this extent buffer are wrong and failing prematurely.
Fix this modify btree_read_extent_buffer_pages to read all copies of
the data.
This failure was exhibitted in xfstests btrfs/124 which would
spuriously fail its balance operations. The reason was that when balance
was run following re-introduction of the missing raid1 disk
__btrfs_map_block would map the read request to stripe 0, which
corresponded to devid 2 (the disk which is being removed in the test):
item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 3553624064) itemoff 15975 itemsize 112
length 1073741824 owner 2 stripe_len 65536 type DATA|RAID1
io_align 65536 io_width 65536 sector_size 4096
num_stripes 2 sub_stripes 1
stripe 0 devid 2 offset 2156920832
dev_uuid 8466c350-ed0c-4c3b-b17d-6379b445d5c8
stripe 1 devid 1 offset 3553624064
dev_uuid 1265d8db-5596-477e-af03-df08eb38d2ca
This caused read requests for a checksum item that to be routed to the
stale disk which triggered the aforementioned logic involving
EXTENT_BUFFER_CORRUPT flag. This then triggered cascading failures of
the balance operation.
Fixes: a826d6dcb32d ("Btrfs: check items for correctness as we search")
CC: stable(a)vger.kernel.org # 4.4+
Suggested-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: Nikolay Borisov <nborisov(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 3f0b6d1936e8..6d776717d8b3 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -477,9 +477,9 @@ static int btree_read_extent_buffer_pages(struct btrfs_fs_info *fs_info,
int mirror_num = 0;
int failed_mirror = 0;
- clear_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags);
io_tree = &BTRFS_I(fs_info->btree_inode)->io_tree;
while (1) {
+ clear_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags);
ret = read_extent_buffer_pages(io_tree, eb, WAIT_COMPLETE,
mirror_num);
if (!ret) {
@@ -493,15 +493,6 @@ static int btree_read_extent_buffer_pages(struct btrfs_fs_info *fs_info,
break;
}
- /*
- * This buffer's crc is fine, but its contents are corrupted, so
- * there is no reason to read the other copies, they won't be
- * any less wrong.
- */
- if (test_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags) ||
- ret == -EUCLEAN)
- break;
-
num_copies = btrfs_num_copies(fs_info,
eb->start, eb->len);
if (num_copies == 1)
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f8397d69daef06d358430d3054662fb597e37c00 Mon Sep 17 00:00:00 2001
From: Nikolay Borisov <nborisov(a)suse.com>
Date: Tue, 6 Nov 2018 16:40:20 +0200
Subject: [PATCH] btrfs: Always try all copies when reading extent buffers
When a metadata read is served the endio routine btree_readpage_end_io_hook
is called which eventually runs the tree-checker. If tree-checker fails
to validate the read eb then it sets EXTENT_BUFFER_CORRUPT flag. This
leads to btree_read_extent_buffer_pages wrongly assuming that all
available copies of this extent buffer are wrong and failing prematurely.
Fix this modify btree_read_extent_buffer_pages to read all copies of
the data.
This failure was exhibitted in xfstests btrfs/124 which would
spuriously fail its balance operations. The reason was that when balance
was run following re-introduction of the missing raid1 disk
__btrfs_map_block would map the read request to stripe 0, which
corresponded to devid 2 (the disk which is being removed in the test):
item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 3553624064) itemoff 15975 itemsize 112
length 1073741824 owner 2 stripe_len 65536 type DATA|RAID1
io_align 65536 io_width 65536 sector_size 4096
num_stripes 2 sub_stripes 1
stripe 0 devid 2 offset 2156920832
dev_uuid 8466c350-ed0c-4c3b-b17d-6379b445d5c8
stripe 1 devid 1 offset 3553624064
dev_uuid 1265d8db-5596-477e-af03-df08eb38d2ca
This caused read requests for a checksum item that to be routed to the
stale disk which triggered the aforementioned logic involving
EXTENT_BUFFER_CORRUPT flag. This then triggered cascading failures of
the balance operation.
Fixes: a826d6dcb32d ("Btrfs: check items for correctness as we search")
CC: stable(a)vger.kernel.org # 4.4+
Suggested-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: Nikolay Borisov <nborisov(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 3f0b6d1936e8..6d776717d8b3 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -477,9 +477,9 @@ static int btree_read_extent_buffer_pages(struct btrfs_fs_info *fs_info,
int mirror_num = 0;
int failed_mirror = 0;
- clear_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags);
io_tree = &BTRFS_I(fs_info->btree_inode)->io_tree;
while (1) {
+ clear_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags);
ret = read_extent_buffer_pages(io_tree, eb, WAIT_COMPLETE,
mirror_num);
if (!ret) {
@@ -493,15 +493,6 @@ static int btree_read_extent_buffer_pages(struct btrfs_fs_info *fs_info,
break;
}
- /*
- * This buffer's crc is fine, but its contents are corrupted, so
- * there is no reason to read the other copies, they won't be
- * any less wrong.
- */
- if (test_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags) ||
- ret == -EUCLEAN)
- break;
-
num_copies = btrfs_num_copies(fs_info,
eb->start, eb->len);
if (num_copies == 1)
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ceff2f4dcd44abf35864d9a99f85ac619e89a01d Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Mon, 27 Aug 2018 10:21:46 +0200
Subject: [PATCH] drm/mediatek: fix OF sibling-node lookup
Use the new of_get_compatible_child() helper to lookup the sibling
instead of using of_find_compatible_node(), which searches the entire
tree from a given start node and thus can return an unrelated (i.e.
non-sibling) node.
This also addresses a potential use-after-free (e.g. after probe
deferral) as the tree-wide helper drops a reference to its first
argument (i.e. the parent device node).
While at it, also fix the related cec-node reference leak.
Fixes: 8f83f26891e1 ("drm/mediatek: Add HDMI support")
Cc: stable <stable(a)vger.kernel.org> # 4.8
Cc: Junzhi Zhao <junzhi.zhao(a)mediatek.com>
Cc: Philipp Zabel <p.zabel(a)pengutronix.de>
Cc: CK Hu <ck.hu(a)mediatek.com>
Cc: David Airlie <airlied(a)linux.ie>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Rob Herring <robh(a)kernel.org>
diff --git a/drivers/gpu/drm/mediatek/mtk_hdmi.c b/drivers/gpu/drm/mediatek/mtk_hdmi.c
index 2d45d1dd9554..643f5edd68fe 100644
--- a/drivers/gpu/drm/mediatek/mtk_hdmi.c
+++ b/drivers/gpu/drm/mediatek/mtk_hdmi.c
@@ -1446,8 +1446,7 @@ static int mtk_hdmi_dt_parse_pdata(struct mtk_hdmi *hdmi,
}
/* The CEC module handles HDMI hotplug detection */
- cec_np = of_find_compatible_node(np->parent, NULL,
- "mediatek,mt8173-cec");
+ cec_np = of_get_compatible_child(np->parent, "mediatek,mt8173-cec");
if (!cec_np) {
dev_err(dev, "Failed to find CEC node\n");
return -EINVAL;
@@ -1457,8 +1456,10 @@ static int mtk_hdmi_dt_parse_pdata(struct mtk_hdmi *hdmi,
if (!cec_pdev) {
dev_err(hdmi->dev, "Waiting for CEC device %pOF\n",
cec_np);
+ of_node_put(cec_np);
return -EPROBE_DEFER;
}
+ of_node_put(cec_np);
hdmi->cec_dev = &cec_pdev->dev;
/*
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f9a7082327e26f54067a49cac2316d31e0cc8ba7 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Mon, 27 Aug 2018 10:21:47 +0200
Subject: [PATCH] drm/msm: fix OF child-node lookup
Use the new of_get_compatible_child() helper to lookup the legacy
pwrlevels child node instead of using of_find_compatible_node(), which
searches the entire tree from a given start node and thus can return an
unrelated (i.e. non-child) node.
This also addresses a potential use-after-free (e.g. after probe
deferral) as the tree-wide helper drops a reference to its first
argument (i.e. the probed device's node).
While at it, also fix the related child-node reference leak.
Fixes: e2af8b6b0ca1 ("drm/msm: gpu: Use OPP tables if we can")
Cc: stable <stable(a)vger.kernel.org> # 4.12
Cc: Jordan Crouse <jcrouse(a)codeaurora.org>
Cc: Rob Clark <robdclark(a)gmail.com>
Cc: David Airlie <airlied(a)linux.ie>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Rob Herring <robh(a)kernel.org>
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index da1363a0c54d..93d70f4a2154 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -633,8 +633,7 @@ static int adreno_get_legacy_pwrlevels(struct device *dev)
struct device_node *child, *node;
int ret;
- node = of_find_compatible_node(dev->of_node, NULL,
- "qcom,gpu-pwrlevels");
+ node = of_get_compatible_child(dev->of_node, "qcom,gpu-pwrlevels");
if (!node) {
dev_err(dev, "Could not find the GPU powerlevels\n");
return -ENXIO;
@@ -655,6 +654,8 @@ static int adreno_get_legacy_pwrlevels(struct device *dev)
dev_pm_opp_add(dev, val, 0);
}
+ of_node_put(node);
+
return 0;
}
From: Alek Du <alek.du(a)intel.com>
We observed some premature timeouts on a virtualization platform, the log
is like this:
case 1:
[159525.255629] mmc1: Internal clock never stabilised.
[159525.255818] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
[159525.256049] mmc1: sdhci: Sys addr: 0x00000000 | Version: 0x00001002
...
[159525.257205] mmc1: sdhci: Wake-up: 0x00000000 | Clock: 0x0000fa03
>From the clock control register dump, we are pretty sure the clock was
stablized.
case 2:
[ 914.550127] mmc1: Reset 0x2 never completed.
[ 914.550321] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 914.550608] mmc1: sdhci: Sys addr: 0x00000010 | Version: 0x00001002
After checking the sdhci code, we found the timeout check actually has a
little window that the CPU can be scheduled out and when it comes back,
the original time set or check is not valid.
Fixes: 5a436cc0af62 ("mmc: sdhci: Optimize delay loops")
Cc: stable(a)vger.kernel.org # v4.12+
Signed-off-by: Alek Du <alek.du(a)intel.com>
---
drivers/mmc/host/sdhci.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 99bdae53fa2e..451b08a818a9 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -216,8 +216,12 @@ void sdhci_reset(struct sdhci_host *host, u8 mask)
timeout = ktime_add_ms(ktime_get(), 100);
/* hw clears the bit when it's done */
- while (sdhci_readb(host, SDHCI_SOFTWARE_RESET) & mask) {
- if (ktime_after(ktime_get(), timeout)) {
+ while (1) {
+ bool timedout = ktime_after(ktime_get(), timeout);
+
+ if (!(sdhci_readb(host, SDHCI_SOFTWARE_RESET) & mask))
+ break;
+ if (timedout) {
pr_err("%s: Reset 0x%x never completed.\n",
mmc_hostname(host->mmc), (int)mask);
sdhci_dumpregs(host);
@@ -1608,9 +1612,13 @@ void sdhci_enable_clk(struct sdhci_host *host, u16 clk)
/* Wait max 20 ms */
timeout = ktime_add_ms(ktime_get(), 20);
- while (!((clk = sdhci_readw(host, SDHCI_CLOCK_CONTROL))
- & SDHCI_CLOCK_INT_STABLE)) {
- if (ktime_after(ktime_get(), timeout)) {
+ while (1) {
+ bool timedout = ktime_after(ktime_get(), timeout);
+
+ clk = sdhci_readw(host, SDHCI_CLOCK_CONTROL);
+ if (clk & SDHCI_CLOCK_INT_STABLE)
+ break;
+ if (timedout) {
pr_err("%s: Internal clock never stabilised.\n",
mmc_hostname(host->mmc));
sdhci_dumpregs(host);
--
2.17.1
From: Markus Hofstaetter <markus.hofstaetter(a)ait.ac.at>
commit f16703360da7731a057df2ffa902306819c22398 upstream.
Some PWMs are disabled by default or the default pin setting
does not match the LED_OFF state (e.g., active-low leds).
Hence, the driver may end up reporting 0 brightness, but
the leds are actually on using full brightness, because
it never enforces its default configuration.
So enforce it by calling led_pwm_set() after successfully
registering the device.
Tested on a Phytec phyFLEX i.MX6Q board based on kernel
v3.19.5.
Signed-off-by: Markus Hofstaetter <markus.hofstaetter(a)ait.ac.at>
Tested-by: Markus Hofstaetter <markus.hofstaetter(a)ait.ac.at>
Signed-off-by: Jacek Anaszewski <j.anaszewski(a)samsung.com>
Signed-off-by: Krzysztof Kozlowski <krzk(a)kernel.org>
---
drivers/leds/leds-pwm.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/leds/leds-pwm.c b/drivers/leds/leds-pwm.c
index 1d07e3e83d29..3149dbece146 100644
--- a/drivers/leds/leds-pwm.c
+++ b/drivers/leds/leds-pwm.c
@@ -132,6 +132,7 @@ static int led_pwm_add(struct device *dev, struct led_pwm_priv *priv,
ret = led_classdev_register(dev, &led_data->cdev);
if (ret == 0) {
priv->num_leds++;
+ led_pwm_set(&led_data->cdev, led_data->cdev.brightness);
} else {
dev_err(dev, "failed to register PWM led for %s: %d\n",
led->name, ret);
--
2.7.4