This is a note to let you know that I've just added the patch titled
rtc-opal: Fix handling of firmware error codes, prevent busy loops
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
rtc-opal-fix-handling-of-firmware-error-codes-prevent-busy-loops.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 5b8b58063029f02da573120ef4dc9079822e3cda Mon Sep 17 00:00:00 2001
From: Stewart Smith <stewart(a)linux.vnet.ibm.com>
Date: Tue, 2 Aug 2016 11:50:16 +1000
Subject: rtc-opal: Fix handling of firmware error codes, prevent busy loops
From: Stewart Smith <stewart(a)linux.vnet.ibm.com>
commit 5b8b58063029f02da573120ef4dc9079822e3cda upstream.
According to the OPAL docs:
skiboot-5.2.5/doc/opal-api/opal-rtc-read-3.txt
skiboot-5.2.5/doc/opal-api/opal-rtc-write-4.txt
OPAL_HARDWARE may be returned from OPAL_RTC_READ or OPAL_RTC_WRITE and
this indicates either a transient or permanent error.
Prior to this patch, Linux was not dealing with OPAL_HARDWARE being a
permanent error particularly well, in that you could end up in a busy
loop.
This was not too hard to trigger on an AMI BMC based OpenPOWER machine
doing a continuous "ipmitool mc reset cold" to the BMC, the result of
that being that we'd get stuck in an infinite loop in
opal_get_rtc_time().
We now retry a few times before returning the error higher up the
stack.
Fixes: 16b1d26e77b1 ("rtc/tpo: Driver to support rtc and wakeup on PowerNV platform")
Cc: stable(a)vger.kernel.org # v3.19+
Signed-off-by: Stewart Smith <stewart(a)linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/rtc/rtc-opal.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
--- a/drivers/rtc/rtc-opal.c
+++ b/drivers/rtc/rtc-opal.c
@@ -58,6 +58,7 @@ static void tm_to_opal(struct rtc_time *
static int opal_get_rtc_time(struct device *dev, struct rtc_time *tm)
{
long rc = OPAL_BUSY;
+ int retries = 10;
u32 y_m_d;
u64 h_m_s_ms;
__be32 __y_m_d;
@@ -67,8 +68,11 @@ static int opal_get_rtc_time(struct devi
rc = opal_rtc_read(&__y_m_d, &__h_m_s_ms);
if (rc == OPAL_BUSY_EVENT)
opal_poll_events(NULL);
- else
+ else if (retries-- && (rc == OPAL_HARDWARE
+ || rc == OPAL_INTERNAL_ERROR))
msleep(10);
+ else if (rc != OPAL_BUSY && rc != OPAL_BUSY_EVENT)
+ break;
}
if (rc != OPAL_SUCCESS)
@@ -84,6 +88,7 @@ static int opal_get_rtc_time(struct devi
static int opal_set_rtc_time(struct device *dev, struct rtc_time *tm)
{
long rc = OPAL_BUSY;
+ int retries = 10;
u32 y_m_d = 0;
u64 h_m_s_ms = 0;
@@ -92,8 +97,11 @@ static int opal_set_rtc_time(struct devi
rc = opal_rtc_write(y_m_d, h_m_s_ms);
if (rc == OPAL_BUSY_EVENT)
opal_poll_events(NULL);
- else
+ else if (retries-- && (rc == OPAL_HARDWARE
+ || rc == OPAL_INTERNAL_ERROR))
msleep(10);
+ else if (rc != OPAL_BUSY && rc != OPAL_BUSY_EVENT)
+ break;
}
return rc == OPAL_SUCCESS ? 0 : -EIO;
Patches currently in stable-queue which might be from stewart(a)linux.vnet.ibm.com are
queue-4.4/rtc-opal-fix-handling-of-firmware-error-codes-prevent-busy-loops.patch
This is a note to let you know that I've just added the patch titled
mm: hide a #warning for COMPILE_TEST
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-hide-a-warning-for-compile_test.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From af27d9403f5b80685b79c88425086edccecaf711 Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Fri, 16 Feb 2018 16:25:53 +0100
Subject: mm: hide a #warning for COMPILE_TEST
From: Arnd Bergmann <arnd(a)arndb.de>
commit af27d9403f5b80685b79c88425086edccecaf711 upstream.
We get a warning about some slow configurations in randconfig kernels:
mm/memory.c:83:2: error: #warning Unfortunate NUMA and NUMA Balancing config, growing page-frame for last_cpupid. [-Werror=cpp]
The warning is reasonable by itself, but gets in the way of randconfig
build testing, so I'm hiding it whenever CONFIG_COMPILE_TEST is set.
The warning was added in 2013 in commit 75980e97dacc ("mm: fold
page->_last_nid into page->flags where possible").
Cc: stable(a)vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -72,7 +72,7 @@
#include "internal.h"
-#ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
+#if defined(LAST_CPUPID_NOT_IN_PAGE_FLAGS) && !defined(CONFIG_COMPILE_TEST)
#warning Unfortunate NUMA and NUMA Balancing config, growing page-frame for last_cpupid.
#endif
Patches currently in stable-queue which might be from arnd(a)arndb.de are
queue-4.4/arm-spear600-add-missing-interrupt-parent-of-rtc.patch
queue-4.4/arm-dts-sti-add-gpio-polarity-for-hdmi-hpd-gpio-property.patch
queue-4.4/arm-spear13xx-fix-spics-gpio-controller-s-warning.patch
queue-4.4/mm-hide-a-warning-for-compile_test.patch
queue-4.4/arm-spear13xx-fix-dmas-cells.patch
This is a note to let you know that I've just added the patch titled
ext4: save error to disk in __ext4_grp_locked_error()
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ext4-save-error-to-disk-in-__ext4_grp_locked_error.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 06f29cc81f0350261f59643a505010531130eea0 Mon Sep 17 00:00:00 2001
From: Zhouyi Zhou <zhouzhouyi(a)gmail.com>
Date: Wed, 10 Jan 2018 00:34:19 -0500
Subject: ext4: save error to disk in __ext4_grp_locked_error()
From: Zhouyi Zhou <zhouzhouyi(a)gmail.com>
commit 06f29cc81f0350261f59643a505010531130eea0 upstream.
In the function __ext4_grp_locked_error(), __save_error_info()
is called to save error info in super block block, but does not sync
that information to disk to info the subsequence fsck after reboot.
This patch writes the error information to disk. After this patch,
I think there is no obvious EXT4 error handle branches which leads to
"Remounting filesystem read-only" will leave the disk partition miss
the subsequence fsck.
Signed-off-by: Zhouyi Zhou <zhouzhouyi(a)gmail.com>
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Cc: stable(a)vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/ext4/super.c | 1 +
1 file changed, 1 insertion(+)
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -688,6 +688,7 @@ __acquires(bitlock)
}
ext4_unlock_group(sb, grp);
+ ext4_commit_super(sb, 1);
ext4_handle_error(sb);
/*
* We only get here in the ERRORS_RO case; relocking the group
Patches currently in stable-queue which might be from zhouzhouyi(a)gmail.com are
queue-4.4/ext4-save-error-to-disk-in-__ext4_grp_locked_error.patch
This is a note to let you know that I've just added the patch titled
drm/radeon: adjust tested variable
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
drm-radeon-adjust-tested-variable.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 3a61b527b4e1f285d21b6e9e623dc45cf8bb391f Mon Sep 17 00:00:00 2001
From: Julia Lawall <Julia.Lawall(a)lip6.fr>
Date: Sat, 27 Jan 2018 15:28:15 +0100
Subject: drm/radeon: adjust tested variable
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Julia Lawall <Julia.Lawall(a)lip6.fr>
commit 3a61b527b4e1f285d21b6e9e623dc45cf8bb391f upstream.
Check the variable that was most recently initialized.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression x, y, f, g, e, m;
statement S1,S2,S3,S4;
@@
x = f(...);
if (\(<+...x...+>\&e\)) S1 else S2
(
x = g(...);
|
m = g(...,&x,...);
|
y = g(...);
*if (e)
S3 else S4
)
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall(a)lip6.fr>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/radeon/radeon_uvd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/radeon/radeon_uvd.c
+++ b/drivers/gpu/drm/radeon/radeon_uvd.c
@@ -946,7 +946,7 @@ int radeon_uvd_calc_upll_dividers(struct
/* calc dclk divider with current vco freq */
dclk_div = radeon_uvd_calc_upll_post_div(vco_freq, dclk,
pd_min, pd_even);
- if (vclk_div > pd_max)
+ if (dclk_div > pd_max)
break; /* vco is too big, it has to stop */
/* calc score with current vco freq */
Patches currently in stable-queue which might be from Julia.Lawall(a)lip6.fr are
queue-4.4/drm-radeon-adjust-tested-variable.patch
This is a note to let you know that I've just added the patch titled
ext4: correct documentation for grpid mount option
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ext4-correct-documentation-for-grpid-mount-option.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 9f0372488cc9243018a812e8cfbf27de650b187b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ernesto=20A=2E=20Fern=C3=A1ndez?=
<ernesto.mnd.fernandez(a)gmail.com>
Date: Thu, 11 Jan 2018 13:43:33 -0500
Subject: ext4: correct documentation for grpid mount option
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Ernesto A. Fernández <ernesto.mnd.fernandez(a)gmail.com>
commit 9f0372488cc9243018a812e8cfbf27de650b187b upstream.
The grpid option is currently described as being the same as nogrpid.
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez(a)gmail.com>
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Cc: stable(a)vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
Documentation/filesystems/ext4.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/Documentation/filesystems/ext4.txt
+++ b/Documentation/filesystems/ext4.txt
@@ -233,7 +233,7 @@ data_err=ignore(*) Just print an error m
data_err=abort Abort the journal if an error occurs in a file
data buffer in ordered mode.
-grpid Give objects the same group ID as their creator.
+grpid New objects have the group ID of their parent.
bsdgroups
nogrpid (*) New objects have the group ID of their creator.
Patches currently in stable-queue which might be from ernesto.mnd.fernandez(a)gmail.com are
queue-4.4/ext4-correct-documentation-for-grpid-mount-option.patch
This is a note to let you know that I've just added the patch titled
console/dummy: leave .con_font_get set to NULL
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
console-dummy-leave-.con_font_get-set-to-null.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 724ba8b30b044aa0d94b1cd374fc15806cdd6f18 Mon Sep 17 00:00:00 2001
From: Nicolas Pitre <nicolas.pitre(a)linaro.org>
Date: Mon, 15 Jan 2018 17:04:22 +0100
Subject: console/dummy: leave .con_font_get set to NULL
From: Nicolas Pitre <nicolas.pitre(a)linaro.org>
commit 724ba8b30b044aa0d94b1cd374fc15806cdd6f18 upstream.
When this method is set, the caller expects struct console_font fields
to be properly initialized when it returns. Leave it unset otherwise
nonsensical (leaked kernel stack) values are returned to user space.
Signed-off-by: Nicolas Pitre <nico(a)linaro.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie(a)samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/video/console/dummycon.c | 1 -
1 file changed, 1 deletion(-)
--- a/drivers/video/console/dummycon.c
+++ b/drivers/video/console/dummycon.c
@@ -68,7 +68,6 @@ const struct consw dummy_con = {
.con_switch = DUMMY,
.con_blank = DUMMY,
.con_font_set = DUMMY,
- .con_font_get = DUMMY,
.con_font_default = DUMMY,
.con_font_copy = DUMMY,
.con_set_palette = DUMMY,
Patches currently in stable-queue which might be from nicolas.pitre(a)linaro.org are
queue-4.4/console-dummy-leave-.con_font_get-set-to-null.patch
This is a note to let you know that I've just added the patch titled
xenbus: track caller request id
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
xenbus-track-caller-request-id.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 29fee6eed2811ff1089b30fc579a2d19d78016ab Mon Sep 17 00:00:00 2001
From: Joao Martins <joao.m.martins(a)oracle.com>
Date: Fri, 2 Feb 2018 17:42:33 +0000
Subject: xenbus: track caller request id
From: Joao Martins <joao.m.martins(a)oracle.com>
commit 29fee6eed2811ff1089b30fc579a2d19d78016ab upstream.
Commit fd8aa9095a95 ("xen: optimize xenbus driver for multiple concurrent
xenstore accesses") optimized xenbus concurrent accesses but in doing so
broke UABI of /dev/xen/xenbus. Through /dev/xen/xenbus applications are in
charge of xenbus message exchange with the correct header and body. Now,
after the mentioned commit the replies received by application will no
longer have the header req_id echoed back as it was on request (see
specification below for reference), because that particular field is being
overwritten by kernel.
struct xsd_sockmsg
{
uint32_t type; /* XS_??? */
uint32_t req_id;/* Request identifier, echoed in daemon's response. */
uint32_t tx_id; /* Transaction id (0 if not related to a transaction). */
uint32_t len; /* Length of data following this. */
/* Generally followed by nul-terminated string(s). */
};
Before there was only one request at a time so req_id could simply be
forwarded back and forth. To allow simultaneous requests we need a
different req_id for each message thus kernel keeps a monotonic increasing
counter for this field and is written on every request irrespective of
userspace value.
Forwarding again the req_id on userspace requests is not a solution because
we would open the possibility of userspace-generated req_id colliding with
kernel ones. So this patch instead takes another route which is to
artificially keep user req_id while keeping the xenbus logic as is. We do
that by saving the original req_id before xs_send(), use the private kernel
counter as req_id and then once reply comes and was validated, we restore
back the original req_id.
Cc: <stable(a)vger.kernel.org> # 4.11
Fixes: fd8aa9095a ("xen: optimize xenbus driver for multiple concurrent xenstore accesses")
Reported-by: Bhavesh Davda <bhavesh.davda(a)oracle.com>
Signed-off-by: Joao Martins <joao.m.martins(a)oracle.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Signed-off-by: Juergen Gross <jgross(a)suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/xen/xenbus/xenbus.h | 1 +
drivers/xen/xenbus/xenbus_comms.c | 1 +
drivers/xen/xenbus/xenbus_xs.c | 3 +++
3 files changed, 5 insertions(+)
--- a/drivers/xen/xenbus/xenbus.h
+++ b/drivers/xen/xenbus/xenbus.h
@@ -76,6 +76,7 @@ struct xb_req_data {
struct list_head list;
wait_queue_head_t wq;
struct xsd_sockmsg msg;
+ uint32_t caller_req_id;
enum xsd_sockmsg_type type;
char *body;
const struct kvec *vec;
--- a/drivers/xen/xenbus/xenbus_comms.c
+++ b/drivers/xen/xenbus/xenbus_comms.c
@@ -309,6 +309,7 @@ static int process_msg(void)
goto out;
if (req->state == xb_req_state_wait_reply) {
+ req->msg.req_id = req->caller_req_id;
req->msg.type = state.msg.type;
req->msg.len = state.msg.len;
req->body = state.body;
--- a/drivers/xen/xenbus/xenbus_xs.c
+++ b/drivers/xen/xenbus/xenbus_xs.c
@@ -227,6 +227,8 @@ static void xs_send(struct xb_req_data *
req->state = xb_req_state_queued;
init_waitqueue_head(&req->wq);
+ /* Save the caller req_id and restore it later in the reply */
+ req->caller_req_id = req->msg.req_id;
req->msg.req_id = xs_request_enter(req);
mutex_lock(&xb_write_mutex);
@@ -310,6 +312,7 @@ static void *xs_talkv(struct xenbus_tran
req->num_vecs = num_vecs;
req->cb = xs_wake_up;
+ msg.req_id = 0;
msg.tx_id = t.id;
msg.type = type;
msg.len = 0;
Patches currently in stable-queue which might be from joao.m.martins(a)oracle.com are
queue-4.15/xenbus-track-caller-request-id.patch
This is a note to let you know that I've just added the patch titled
xen: Fix {set,clear}_foreign_p2m_mapping on autotranslating guests
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
xen-fix-set-clear-_foreign_p2m_mapping-on-autotranslating-guests.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 781198f1f373c3e350dbeb3af04a7d4c81c1b8d7 Mon Sep 17 00:00:00 2001
From: Simon Gaiser <simon(a)invisiblethingslab.com>
Date: Wed, 7 Feb 2018 21:47:40 +0100
Subject: xen: Fix {set,clear}_foreign_p2m_mapping on autotranslating guests
From: Simon Gaiser <simon(a)invisiblethingslab.com>
commit 781198f1f373c3e350dbeb3af04a7d4c81c1b8d7 upstream.
Commit 82616f9599a7 ("xen: remove tests for pvh mode in pure pv paths")
removed the check for autotranslation from {set,clear}_foreign_p2m_mapping
but those are called by grant-table.c also on PVH/HVM guests.
Cc: <stable(a)vger.kernel.org> # 4.14
Fixes: 82616f9599a7 ("xen: remove tests for pvh mode in pure pv paths")
Signed-off-by: Simon Gaiser <simon(a)invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Signed-off-by: Juergen Gross <jgross(a)suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/xen/p2m.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/arch/x86/xen/p2m.c
+++ b/arch/x86/xen/p2m.c
@@ -694,6 +694,9 @@ int set_foreign_p2m_mapping(struct gntta
int i, ret = 0;
pte_t *pte;
+ if (xen_feature(XENFEAT_auto_translated_physmap))
+ return 0;
+
if (kmap_ops) {
ret = HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref,
kmap_ops, count);
@@ -736,6 +739,9 @@ int clear_foreign_p2m_mapping(struct gnt
{
int i, ret = 0;
+ if (xen_feature(XENFEAT_auto_translated_physmap))
+ return 0;
+
for (i = 0; i < count; i++) {
unsigned long mfn = __pfn_to_mfn(page_to_pfn(pages[i]));
unsigned long pfn = page_to_pfn(pages[i]);
Patches currently in stable-queue which might be from simon(a)invisiblethingslab.com are
queue-4.15/xen-fix-set-clear-_foreign_p2m_mapping-on-autotranslating-guests.patch
This is a note to let you know that I've just added the patch titled
x86/smpboot: Fix uncore_pci_remove() indexing bug when hot-removing a physical CPU
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-smpboot-fix-uncore_pci_remove-indexing-bug-when-hot-removing-a-physical-cpu.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 295cc7eb314eb3321fb6d67ca6f7305f5c50d10f Mon Sep 17 00:00:00 2001
From: Masayoshi Mizuma <m.mizuma(a)jp.fujitsu.com>
Date: Thu, 8 Feb 2018 09:19:08 -0500
Subject: x86/smpboot: Fix uncore_pci_remove() indexing bug when hot-removing a physical CPU
From: Masayoshi Mizuma <m.mizuma(a)jp.fujitsu.com>
commit 295cc7eb314eb3321fb6d67ca6f7305f5c50d10f upstream.
When a physical CPU is hot-removed, the following warning messages
are shown while the uncore device is removed in uncore_pci_remove():
WARNING: CPU: 120 PID: 5 at arch/x86/events/intel/uncore.c:988
uncore_pci_remove+0xf1/0x110
...
CPU: 120 PID: 5 Comm: kworker/u1024:0 Not tainted 4.15.0-rc8 #1
Workqueue: kacpi_hotplug acpi_hotplug_work_fn
...
Call Trace:
pci_device_remove+0x36/0xb0
device_release_driver_internal+0x145/0x210
pci_stop_bus_device+0x76/0xa0
pci_stop_root_bus+0x44/0x60
acpi_pci_root_remove+0x1f/0x80
acpi_bus_trim+0x54/0x90
acpi_bus_trim+0x2e/0x90
acpi_device_hotplug+0x2bc/0x4b0
acpi_hotplug_work_fn+0x1a/0x30
process_one_work+0x141/0x340
worker_thread+0x47/0x3e0
kthread+0xf5/0x130
When uncore_pci_remove() runs, it tries to get the package ID to
clear the value of uncore_extra_pci_dev[].dev[] by using
topology_phys_to_logical_pkg(). The warning messesages are
shown because topology_phys_to_logical_pkg() returns -1.
arch/x86/events/intel/uncore.c:
static void uncore_pci_remove(struct pci_dev *pdev)
{
...
phys_id = uncore_pcibus_to_physid(pdev->bus);
...
pkg = topology_phys_to_logical_pkg(phys_id); // returns -1
for (i = 0; i < UNCORE_EXTRA_PCI_DEV_MAX; i++) {
if (uncore_extra_pci_dev[pkg].dev[i] == pdev) {
uncore_extra_pci_dev[pkg].dev[i] = NULL;
break;
}
}
WARN_ON_ONCE(i >= UNCORE_EXTRA_PCI_DEV_MAX); // <=========== HERE!!
topology_phys_to_logical_pkg() tries to find
cpuinfo_x86->phys_proc_id that matches the phys_pkg argument.
arch/x86/kernel/smpboot.c:
int topology_phys_to_logical_pkg(unsigned int phys_pkg)
{
int cpu;
for_each_possible_cpu(cpu) {
struct cpuinfo_x86 *c = &cpu_data(cpu);
if (c->initialized && c->phys_proc_id == phys_pkg)
return c->logical_proc_id;
}
return -1;
}
However, the phys_proc_id was already set to 0 by remove_siblinginfo()
when the CPU was offlined.
So, topology_phys_to_logical_pkg() cannot find the correct
logical_proc_id and always returns -1.
As the result, uncore_pci_remove() calls WARN_ON_ONCE() and the warning
messages are shown.
What is worse is that the bogus 'pkg' index results in two bugs:
- We dereference uncore_extra_pci_dev[] with a negative index
- We fail to clean up a stale pointer in uncore_extra_pci_dev[][]
To fix these bugs, remove the clearing of ->phys_proc_id from remove_siblinginfo().
This should not cause any problems, because ->phys_proc_id is not
used after it is hot-removed and it is re-set while hot-adding.
Signed-off-by: Masayoshi Mizuma <m.mizuma(a)jp.fujitsu.com>
Acked-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: yasu.isimatu(a)gmail.com
Cc: <stable(a)vger.kernel.org>
Fixes: 30bb9811856f ("x86/topology: Avoid wasting 128k for package id array")
Link: http://lkml.kernel.org/r/ed738d54-0f01-b38b-b794-c31dc118c207@gmail.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/smpboot.c | 1 -
1 file changed, 1 deletion(-)
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1431,7 +1431,6 @@ static void remove_siblinginfo(int cpu)
cpumask_clear(cpu_llc_shared_mask(cpu));
cpumask_clear(topology_sibling_cpumask(cpu));
cpumask_clear(topology_core_cpumask(cpu));
- c->phys_proc_id = 0;
c->cpu_core_id = 0;
cpumask_clear_cpu(cpu, cpu_sibling_setup_mask);
recompute_smt_state();
Patches currently in stable-queue which might be from m.mizuma(a)jp.fujitsu.com are
queue-4.15/x86-smpboot-fix-uncore_pci_remove-indexing-bug-when-hot-removing-a-physical-cpu.patch
This is a note to let you know that I've just added the patch titled
video: fbdev: atmel_lcdfb: fix display-timings lookup
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
video-fbdev-atmel_lcdfb-fix-display-timings-lookup.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 9cb18db0701f6b74f0c45c23ad767b3ebebe37f6 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Fri, 29 Dec 2017 19:48:43 +0100
Subject: video: fbdev: atmel_lcdfb: fix display-timings lookup
From: Johan Hovold <johan(a)kernel.org>
commit 9cb18db0701f6b74f0c45c23ad767b3ebebe37f6 upstream.
Fix child-node lookup during probe, which ended up searching the whole
device tree depth-first starting at the parent rather than just matching
on its children.
To make things worse, the parent display node was also prematurely
freed.
Note that the display and timings node references are never put after a
successful dt-initialisation so the nodes would leak on later probe
deferrals and on driver unbind.
Fixes: b985172b328a ("video: atmel_lcdfb: add device tree suport")
Cc: stable <stable(a)vger.kernel.org> # 3.13
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj(a)jcrosoft.com>
Cc: Nicolas Ferre <nicolas.ferre(a)microchip.com>
Cc: Alexandre Belloni <alexandre.belloni(a)free-electrons.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie(a)samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/video/fbdev/atmel_lcdfb.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/video/fbdev/atmel_lcdfb.c
+++ b/drivers/video/fbdev/atmel_lcdfb.c
@@ -1119,7 +1119,7 @@ static int atmel_lcdfb_of_init(struct at
goto put_display_node;
}
- timings_np = of_find_node_by_name(display_np, "display-timings");
+ timings_np = of_get_child_by_name(display_np, "display-timings");
if (!timings_np) {
dev_err(dev, "failed to find display-timings node\n");
ret = -ENODEV;
@@ -1140,6 +1140,12 @@ static int atmel_lcdfb_of_init(struct at
fb_add_videomode(&fb_vm, &info->modelist);
}
+ /*
+ * FIXME: Make sure we are not referencing any fields in display_np
+ * and timings_np and drop our references to them before returning to
+ * avoid leaking the nodes on probe deferral and driver unbind.
+ */
+
return 0;
put_timings_node:
Patches currently in stable-queue which might be from johan(a)kernel.org are
queue-4.15/pci-keystone-fix-interrupt-controller-node-lookup.patch
queue-4.15/video-fbdev-atmel_lcdfb-fix-display-timings-lookup.patch