This patch series includes some improvement to Machine check handler
for pseries. Patch 1 fixes a buffer overrun issue if rtas extended error
log size is greater than RTAS_ERROR_LOG_MAX.
Patch 2 fixes an issue where machine check handler crashes
kernel while accessing vmalloc-ed buffer while in nmi context.
Patch 3 fixes endain bug while restoring of r3 in MCE handler.
Patch 5 implements a real mode mce handler and flushes the SLBs on SLB error.
Patch 6 display's the MCE error details on console.
Patch 7 saves and dumps the SLB contents on SLB MCE errors to improve the
debugability.
Change in V5:
- Use min_t instead of max_t.
- Fix an issue reported by kbuild test robot and addressed review comments.
Change in V4:
- Flush the SLBs in real mode mce handler to handle SLB errors for entry 0.
- Allocate buffers per cpu to hold rtas error log and old slb contents.
- Defer the logging of rtas error log to irq work queue.
Change in V3:
- Moved patch 5 to patch 2
Change in V2:
- patch 3: Display additional info (NIP and task info) in MCE error details.
- patch 5: Fix endain bug while restoring of r3 in MCE handler.
---
Mahesh Salgaonkar (7):
powerpc/pseries: Avoid using the size greater than
powerpc/pseries: Defer the logging of rtas error to irq work queue.
powerpc/pseries: Fix endainness while restoring of r3 in MCE handler.
powerpc/pseries: Define MCE error event section.
powerpc/pseries: flush SLB contents on SLB MCE errors.
powerpc/pseries: Display machine check error details.
powerpc/pseries: Dump the SLB contents on SLB MCE errors.
arch/powerpc/include/asm/book3s/64/mmu-hash.h | 8 +
arch/powerpc/include/asm/machdep.h | 1
arch/powerpc/include/asm/paca.h | 4
arch/powerpc/include/asm/rtas.h | 116 ++++++++++++
arch/powerpc/kernel/exceptions-64s.S | 42 ++++
arch/powerpc/kernel/mce.c | 16 +-
arch/powerpc/mm/slb.c | 63 +++++++
arch/powerpc/platforms/powernv/opal.c | 1
arch/powerpc/platforms/pseries/pseries.h | 1
arch/powerpc/platforms/pseries/ras.c | 241 +++++++++++++++++++++++--
arch/powerpc/platforms/pseries/setup.c | 27 +++
11 files changed, 499 insertions(+), 21 deletions(-)
--
Signature
Embarrassingly, the recent fix introduced worse problem than it solved,
causing the balloon not to inflate. The VM informed the hypervisor that
the pages for lock/unlock are sitting in the wrong address, as it used
the page that is used the uninitialized page variable.
Fixes: b23220fe054e9 ("vmw_balloon: fixing double free when batching mode is off")
Cc: stable(a)vger.kernel.org
Reviewed-by: Xavier Deguillard <xdeguillard(a)vmware.com>
Signed-off-by: Nadav Amit <namit(a)vmware.com>
---
drivers/misc/vmw_balloon.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index efd733472a35..56c6f79a5c5a 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -467,7 +467,7 @@ static int vmballoon_send_batched_lock(struct vmballoon *b,
unsigned int num_pages, bool is_2m_pages, unsigned int *target)
{
unsigned long status;
- unsigned long pfn = page_to_pfn(b->page);
+ unsigned long pfn = PHYS_PFN(virt_to_phys(b->batch_page));
STATS_INC(b->stats.lock[is_2m_pages]);
@@ -515,7 +515,7 @@ static bool vmballoon_send_batched_unlock(struct vmballoon *b,
unsigned int num_pages, bool is_2m_pages, unsigned int *target)
{
unsigned long status;
- unsigned long pfn = page_to_pfn(b->page);
+ unsigned long pfn = PHYS_PFN(virt_to_phys(b->batch_page));
STATS_INC(b->stats.unlock[is_2m_pages]);
--
2.17.1
The patch titled
Subject: mm: teach dump_page() to correctly output poisoned struct pages
has been added to the -mm tree. Its filename is
mm-teach-dump_page-to-correctly-output-poisoned-struct-pages.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-teach-dump_page-to-correctly-ou…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-teach-dump_page-to-correctly-ou…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Subject: mm: teach dump_page() to correctly output poisoned struct pages
If struct page is poisoned, and uninitialized access is detected via
PF_POISONED_CHECK(page) dump_page() is called to output the page. But,
the dump_page() itself accesses struct page to determine how to print it,
and therefore gets into a recursive loop.
For example:
dump_page()
__dump_page()
PageSlab(page)
PF_POISONED_CHECK(page)
VM_BUG_ON_PGFLAGS(PagePoisoned(page), page)
dump_page() recursion loop.
Link: http://lkml.kernel.org/r/20180702180536.2552-1-pasha.tatashin@oracle.com
Fixes: f165b378bbdf ("mm: uninitialized struct page poisoning sanity checking")
Signed-off-by: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/debug.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff -puN mm/debug.c~mm-teach-dump_page-to-correctly-output-poisoned-struct-pages mm/debug.c
--- a/mm/debug.c~mm-teach-dump_page-to-correctly-output-poisoned-struct-pages
+++ a/mm/debug.c
@@ -43,12 +43,25 @@ const struct trace_print_flags vmaflag_n
void __dump_page(struct page *page, const char *reason)
{
+ bool page_poisoned = PagePoisoned(page);
+ int mapcount;
+
+ /*
+ * If struct page is poisoned don't access Page*() functions as that
+ * leads to recursive loop. Page*() check for poisoned pages, and calls
+ * dump_page() when detected.
+ */
+ if (page_poisoned) {
+ pr_emerg("page:%px is uninitialized and poisoned", page);
+ goto hex_only;
+ }
+
/*
* Avoid VM_BUG_ON() in page_mapcount().
* page->_mapcount space in struct page is used by sl[aou]b pages to
* encode own info.
*/
- int mapcount = PageSlab(page) ? 0 : page_mapcount(page);
+ mapcount = PageSlab(page) ? 0 : page_mapcount(page);
pr_emerg("page:%px count:%d mapcount:%d mapping:%px index:%#lx",
page, page_ref_count(page), mapcount,
@@ -60,6 +73,7 @@ void __dump_page(struct page *page, cons
pr_emerg("flags: %#lx(%pGp)\n", page->flags, &page->flags);
+hex_only:
print_hex_dump(KERN_ALERT, "raw: ", DUMP_PREFIX_NONE, 32,
sizeof(unsigned long), page,
sizeof(struct page), false);
@@ -68,7 +82,7 @@ void __dump_page(struct page *page, cons
pr_alert("page dumped because: %s\n", reason);
#ifdef CONFIG_MEMCG
- if (page->mem_cgroup)
+ if (!page_poisoned && page->mem_cgroup)
pr_alert("page->mem_cgroup:%px\n", page->mem_cgroup);
#endif
}
_
Patches currently in -mm which might be from pasha.tatashin(a)oracle.com are
mm-teach-dump_page-to-correctly-output-poisoned-struct-pages.patch
mm-skip-invalid-pages-block-at-a-time-in-zero_resv_unresv.patch
sparc64-ng4-memset-32-bits-overflow.patch
Currently, on the AMD board Asus F2A85-M Pro there is a 100 ms delay as
the USB bus of each of the two OHCI PCI devices is reset. As a 50 ms
delay is done per the USB specification.
Commit c6187597 (OHCI: final fix for NVIDIA problems (I hope))
unconditionally does the bus reset for
all chipsets, while it was only doen for NVIDIA chipsets before.
As it should not be needed for non-NVIDIA chipsets, only do the reset
for Nvidia devices.
Tested on Asus F2A85-M PRO and ASRock E350M1. The USB keyboard works and
the LUKS passphrase can be e
ntered.
Signed-off-by: Paul Menzel <pmenzel(a)molgen.mpg.de>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: linux-usb(a)vger.kernel.org
Cc: Alan Stern <stern(a)rowland.harvard.edu>
Cc: linux-kernel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
---
drivers/usb/host/pci-quirks.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/host/pci-quirks.c b/drivers/usb/host/pci-quirks.c
index 3625a5c1a41b..f6b1a9bbe301 100644
--- a/drivers/usb/host/pci-quirks.c
+++ b/drivers/usb/host/pci-quirks.c
@@ -784,7 +784,7 @@ static void quirk_usb_handoff_ohci(struct pci_dev *pdev)
writel((u32) ~0, base + OHCI_INTRDISABLE);
/* Reset the USB bus, if the controller isn't already in RESET */
- if (control & OHCI_HCFS) {
+ if ((pdev->vendor == PCI_VENDOR_ID_NVIDIA) && (control & OHCI_HCFS)) {
/* Go into RESET, preserving RWC (and possibly IR) */
writel(control & OHCI_CTRL_MASK, base + OHCI_CONTROL);
readl(base + OHCI_CONTROL);
--
2.17.1