From: Xunlei Pang <xlpang(a)redhat.com>
On some of our systems, we notice this error popping up on occasion,
completely hanging the system.
[<ffffffc0000ee398>] enqueue_task_dl+0x1f0/0x420
[<ffffffc0000d0f14>] activate_task+0x7c/0x90
[<ffffffc0000edbdc>] push_dl_task+0x164/0x1c8
[<ffffffc0000edc60>] push_dl_tasks+0x20/0x30
[<ffffffc0000cc00c>] __balance_callback+0x44/0x68
[<ffffffc000d2c018>] __schedule+0x6f0/0x728
[<ffffffc000d2c278>] schedule+0x78/0x98
[<ffffffc000d2e76c>] __rt_mutex_slowlock+0x9c/0x108
[<ffffffc000d2e9d0>] rt_mutex_slowlock+0xd8/0x198
[<ffffffc0000f7f28>] rt_mutex_timed_futex_lock+0x30/0x40
[<ffffffc00012c1a8>] futex_lock_pi+0x200/0x3b0
[<ffffffc00012cf84>] do_futex+0x1c4/0x550
It runs a 4.4 kernel on an arm64 rig. The signature looks suspiciously
similar to what Xunlei Pang observed in his crash, and with this fix, my
issue goes away (my system has survived approx. 1500 reboots and a few
nasty tests so far).
Alongside this patch in the tree, there are a few other bits and pieces
pertaining to futex, rtmutex and kernel/sched/, but those patches create
weird crashes that I have not been able to dissect yet. Once (if) I have
been able to figure those out (and test), they will be sent later.
I am sure other users of LTS that also use sched_deadline will run into
this issue, so I think it is a good candidate for 4.4-stable. Possibly
also for 4.9 and 4.14, but I have not had time to test those versions.
Apart from a minor conflict in sched.h, the patch applied cleanly.
(Tested on arm64 running 4.4.<late-ish>)
-Henrik
A crash happened while I was playing with deadline PI rtmutex.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
IP: [<ffffffff810eeb8f>] rt_mutex_get_top_task+0x1f/0x30
PGD 232a75067 PUD 230947067 PMD 0
Oops: 0000 [#1] SMP
CPU: 1 PID: 10994 Comm: a.out Not tainted
Call Trace:
[<ffffffff810b658c>] enqueue_task+0x2c/0x80
[<ffffffff810ba763>] activate_task+0x23/0x30
[<ffffffff810d0ab5>] pull_dl_task+0x1d5/0x260
[<ffffffff810d0be6>] pre_schedule_dl+0x16/0x20
[<ffffffff8164e783>] __schedule+0xd3/0x900
[<ffffffff8164efd9>] schedule+0x29/0x70
[<ffffffff8165035b>] __rt_mutex_slowlock+0x4b/0xc0
[<ffffffff81650501>] rt_mutex_slowlock+0xd1/0x190
[<ffffffff810eeb33>] rt_mutex_timed_lock+0x53/0x60
[<ffffffff810ecbfc>] futex_lock_pi.isra.18+0x28c/0x390
[<ffffffff810ed8b0>] do_futex+0x190/0x5b0
[<ffffffff810edd50>] SyS_futex+0x80/0x180
This is because rt_mutex_enqueue_pi() and rt_mutex_dequeue_pi()
are only protected by pi_lock when operating on pi waiters, while
rt_mutex_get_top_task() accesses them with the rq lock held but
without holding pi_lock.
To tackle this, we introduce a new "pi_top_task" pointer
cached in task_struct, and add a new rt_mutex_update_top_task()
to update its value. It can be called by rt_mutex_setprio(),
which holds both the owner's pi_lock and the rq lock. Thus "pi_top_task"
can be safely accessed by enqueue_task_dl() under the rq lock.
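The caching idea can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct task, top_waiter, update_top_task(), get_top_task() and effective_prio() are hypothetical stand-ins for the kernel structures and helpers named above, and no real locking is performed; the comments only note which lock the kernel code would hold at each point.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for task_struct; not the kernel definition. */
struct task {
	int prio;                 /* lower value means higher priority */
	struct task *top_waiter;  /* models the top PI waiter (pi_lock only) */
	struct task *pi_top_task; /* cached copy (pi_lock and rq lock) */
};

/* Models rt_mutex_update_top_task(): caller would hold both locks. */
static void update_top_task(struct task *p)
{
	assert(p != NULL);
	p->pi_top_task = p->top_waiter; /* NULL when there are no waiters */
}

/* Models rt_mutex_get_top_task(): reads only the cached pointer. */
static struct task *get_top_task(const struct task *p)
{
	return p->pi_top_task;
}

/* Models rt_mutex_get_effective_prio(): min of top waiter and newprio. */
static int effective_prio(const struct task *p, int newprio)
{
	const struct task *top = get_top_task(p);

	if (!top)
		return newprio;
	return top->prio < newprio ? top->prio : newprio;
}
```

In this model a reader that only holds the (imaginary) rq lock never touches top_waiter; it sees whatever update_top_task() last published, which is the property the patch relies on.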
Originally-From: Peter Zijlstra <peterz(a)infradead.org>
Signed-off-by: Xunlei Pang <xlpang(a)redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Acked-by: Steven Rostedt <rostedt(a)goodmis.org>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: juri.lelli(a)arm.com
Cc: bigeasy(a)linutronix.de
Cc: mathieu.desnoyers(a)efficios.com
Cc: jdesfossez(a)efficios.com
Cc: bristot(a)redhat.com
Link: http://lkml.kernel.org/r/20170323150216.157682758@infradead.org
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
(cherry picked from commit e96a7705e7d3fef96aec9b590c63b2f6f7d2ba22)
Conflicts:
include/linux/sched.h
Backported-and-tested-by: Henrik Austad <haustad(a)cisco.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/init_task.h | 1 +
include/linux/sched.h | 2 ++
include/linux/sched/rt.h | 1 +
kernel/fork.c | 1 +
kernel/locking/rtmutex.c | 29 +++++++++++++++++++++--------
kernel/sched/core.c | 2 ++
6 files changed, 28 insertions(+), 8 deletions(-)
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 1c1ff7e4faa4..a561ce0c5d7f 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -162,6 +162,7 @@ extern struct task_group root_task_group;
#ifdef CONFIG_RT_MUTEXES
# define INIT_RT_MUTEXES(tsk) \
.pi_waiters = RB_ROOT, \
+ .pi_top_task = NULL, \
.pi_waiters_leftmost = NULL,
#else
# define INIT_RT_MUTEXES(tsk)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index a464ba71a993..19a3f946caf0 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1628,6 +1628,8 @@ struct task_struct {
/* PI waiters blocked on a rt_mutex held by this task */
struct rb_root pi_waiters;
struct rb_node *pi_waiters_leftmost;
+ /* Updated under owner's pi_lock and rq lock */
+ struct task_struct *pi_top_task;
/* Deadlock detection and priority inheritance handling */
struct rt_mutex_waiter *pi_blocked_on;
#endif
diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
index a30b172df6e1..60d0c4740b9f 100644
--- a/include/linux/sched/rt.h
+++ b/include/linux/sched/rt.h
@@ -19,6 +19,7 @@ static inline int rt_task(struct task_struct *p)
extern int rt_mutex_getprio(struct task_struct *p);
extern void rt_mutex_setprio(struct task_struct *p, int prio);
extern int rt_mutex_get_effective_prio(struct task_struct *task, int newprio);
+extern void rt_mutex_update_top_task(struct task_struct *p);
extern struct task_struct *rt_mutex_get_top_task(struct task_struct *task);
extern void rt_mutex_adjust_pi(struct task_struct *p);
static inline bool tsk_is_pi_blocked(struct task_struct *tsk)
diff --git a/kernel/fork.c b/kernel/fork.c
index ac00f14208b7..1e5d15defe25 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1240,6 +1240,7 @@ static void rt_mutex_init_task(struct task_struct *p)
#ifdef CONFIG_RT_MUTEXES
p->pi_waiters = RB_ROOT;
p->pi_waiters_leftmost = NULL;
+ p->pi_top_task = NULL;
p->pi_blocked_on = NULL;
#endif
}
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index b066724d7a5b..8ddce9fb6724 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -319,6 +319,19 @@ rt_mutex_dequeue_pi(struct task_struct *task, struct rt_mutex_waiter *waiter)
}
/*
+ * Must hold both p->pi_lock and task_rq(p)->lock.
+ */
+void rt_mutex_update_top_task(struct task_struct *p)
+{
+ if (!task_has_pi_waiters(p)) {
+ p->pi_top_task = NULL;
+ return;
+ }
+
+ p->pi_top_task = task_top_pi_waiter(p)->task;
+}
+
+/*
* Calculate task priority from the waiter tree priority
*
* Return task->normal_prio when the waiter tree is empty or when
@@ -333,12 +346,12 @@ int rt_mutex_getprio(struct task_struct *task)
task->normal_prio);
}
+/*
+ * Must hold either p->pi_lock or task_rq(p)->lock.
+ */
struct task_struct *rt_mutex_get_top_task(struct task_struct *task)
{
- if (likely(!task_has_pi_waiters(task)))
- return NULL;
-
- return task_top_pi_waiter(task)->task;
+ return task->pi_top_task;
}
/*
@@ -347,12 +360,12 @@ struct task_struct *rt_mutex_get_top_task(struct task_struct *task)
*/
int rt_mutex_get_effective_prio(struct task_struct *task, int newprio)
{
- if (!task_has_pi_waiters(task))
+ struct task_struct *top_task = rt_mutex_get_top_task(task);
+
+ if (!top_task)
return newprio;
- if (task_top_pi_waiter(task)->task->prio <= newprio)
- return task_top_pi_waiter(task)->task->prio;
- return newprio;
+ return min(top_task->prio, newprio);
}
/*
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4d1511ad3753..4e23579cc38f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3428,6 +3428,8 @@ void rt_mutex_setprio(struct task_struct *p, int prio)
goto out_unlock;
}
+ rt_mutex_update_top_task(p);
+
trace_sched_pi_setprio(p, prio);
oldprio = p->prio;
prev_class = p->sched_class;
--
2.11.0
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
We deinit the lpe audio device before we call
drm_atomic_helper_shutdown(), which means the platform device
may already be gone when it comes time to shut down the crtc.
As we don't know when the last reference to the platform
device gets dropped by the audio driver we can't assume that
the device and its data are still around when turning off the
crtc. Mark the platform device as gone as soon as we do the
audio deinit.
Cc: stable(a)vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/i915/intel_lpe_audio.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/intel_lpe_audio.c b/drivers/gpu/drm/i915/intel_lpe_audio.c
index cdf19553ffac..5d5336fbe7b0 100644
--- a/drivers/gpu/drm/i915/intel_lpe_audio.c
+++ b/drivers/gpu/drm/i915/intel_lpe_audio.c
@@ -297,8 +297,10 @@ void intel_lpe_audio_teardown(struct drm_i915_private *dev_priv)
lpe_audio_platdev_destroy(dev_priv);
irq_free_desc(dev_priv->lpe_audio.irq);
-}
+ dev_priv->lpe_audio.irq = -1;
+ dev_priv->lpe_audio.platdev = NULL;
+}
/**
* intel_lpe_audio_notify() - notify lpe audio event
--
2.18.1
This is an automatically generated email to let you know that the following patch was queued:
Subject: media: v4l: event: Add subscription to list before calling "add" operation
Author: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Date: Mon Nov 5 09:35:44 2018 -0500
Patch ad608fbcf166 changed how events were subscribed to address an issue
elsewhere. As a side effect of that change, the "add" callback was called
before the event subscription was added to the list of subscribed events,
causing the first event queued by the add callback (and possibly other
events arriving soon afterwards) to be lost.
Fix this by adding the subscription to the list before calling the "add"
callback, and clean up afterwards if that fails.
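The ordering described above can be sketched in a userspace model. Everything here is an assumption made for illustration: struct sub, queue_event(), subscribe(), add_ok() and add_fail() are hypothetical names, and the list and locks of the real code are reduced to a single flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sub {
	bool listed;               /* models membership in fh->subscribed */
	int pending;               /* events delivered to this subscription */
	int (*add)(struct sub *s); /* models the "add" callback */
};

/* Events are dropped unless the subscription is already listed. */
static void queue_event(struct sub *s)
{
	if (s->listed)
		s->pending++;
}

/* List first, then call "add"; roll back if the callback fails. */
static int subscribe(struct sub *s)
{
	int ret = 0;

	assert(s != NULL);
	s->listed = true;
	if (s->add) {
		ret = s->add(s);
		if (ret) {
			s->pending = 0;    /* drop events queued meanwhile */
			s->listed = false; /* undo the early list insertion */
		}
	}
	return ret;
}

/* Example callbacks: both queue an initial event, one then fails. */
static int add_ok(struct sub *s)   { queue_event(s); return 0; }
static int add_fail(struct sub *s) { queue_event(s); return -1; }
```

With the subscription listed before the callback runs, the event queued by add_ok() is kept; with add_fail() the subscription and its pending event are both removed again.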
Fixes: ad608fbcf166 ("media: v4l: event: Prevent freeing event subscriptions while accessed")
Reported-by: Dave Stevenson <dave.stevenson(a)raspberrypi.org>
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Tested-by: Dave Stevenson <dave.stevenson(a)raspberrypi.org>
Reviewed-by: Hans Verkuil <hans.verkuil(a)cisco.com>
Tested-by: Hans Verkuil <hans.verkuil(a)cisco.com>
Cc: stable(a)vger.kernel.org (for 4.14 and up)
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung(a)kernel.org>
drivers/media/v4l2-core/v4l2-event.c | 43 ++++++++++++++++++++----------------
1 file changed, 24 insertions(+), 19 deletions(-)
---
diff --git a/drivers/media/v4l2-core/v4l2-event.c b/drivers/media/v4l2-core/v4l2-event.c
index a3ef1f50a4b3..481e3c65cf97 100644
--- a/drivers/media/v4l2-core/v4l2-event.c
+++ b/drivers/media/v4l2-core/v4l2-event.c
@@ -193,6 +193,22 @@ int v4l2_event_pending(struct v4l2_fh *fh)
}
EXPORT_SYMBOL_GPL(v4l2_event_pending);
+static void __v4l2_event_unsubscribe(struct v4l2_subscribed_event *sev)
+{
+ struct v4l2_fh *fh = sev->fh;
+ unsigned int i;
+
+ lockdep_assert_held(&fh->subscribe_lock);
+ assert_spin_locked(&fh->vdev->fh_lock);
+
+ /* Remove any pending events for this subscription */
+ for (i = 0; i < sev->in_use; i++) {
+ list_del(&sev->events[sev_pos(sev, i)].list);
+ fh->navailable--;
+ }
+ list_del(&sev->list);
+}
+
int v4l2_event_subscribe(struct v4l2_fh *fh,
const struct v4l2_event_subscription *sub, unsigned elems,
const struct v4l2_subscribed_event_ops *ops)
@@ -224,27 +240,23 @@ int v4l2_event_subscribe(struct v4l2_fh *fh,
spin_lock_irqsave(&fh->vdev->fh_lock, flags);
found_ev = v4l2_event_subscribed(fh, sub->type, sub->id);
+ if (!found_ev)
+ list_add(&sev->list, &fh->subscribed);
spin_unlock_irqrestore(&fh->vdev->fh_lock, flags);
if (found_ev) {
/* Already listening */
kvfree(sev);
- goto out_unlock;
- }
-
- if (sev->ops && sev->ops->add) {
+ } else if (sev->ops && sev->ops->add) {
ret = sev->ops->add(sev, elems);
if (ret) {
+ spin_lock_irqsave(&fh->vdev->fh_lock, flags);
+ __v4l2_event_unsubscribe(sev);
+ spin_unlock_irqrestore(&fh->vdev->fh_lock, flags);
kvfree(sev);
- goto out_unlock;
}
}
- spin_lock_irqsave(&fh->vdev->fh_lock, flags);
- list_add(&sev->list, &fh->subscribed);
- spin_unlock_irqrestore(&fh->vdev->fh_lock, flags);
-
-out_unlock:
mutex_unlock(&fh->subscribe_lock);
return ret;
@@ -279,7 +291,6 @@ int v4l2_event_unsubscribe(struct v4l2_fh *fh,
{
struct v4l2_subscribed_event *sev;
unsigned long flags;
- int i;
if (sub->type == V4L2_EVENT_ALL) {
v4l2_event_unsubscribe_all(fh);
@@ -291,14 +302,8 @@ int v4l2_event_unsubscribe(struct v4l2_fh *fh,
spin_lock_irqsave(&fh->vdev->fh_lock, flags);
sev = v4l2_event_subscribed(fh, sub->type, sub->id);
- if (sev != NULL) {
- /* Remove any pending events for this subscription */
- for (i = 0; i < sev->in_use; i++) {
- list_del(&sev->events[sev_pos(sev, i)].list);
- fh->navailable--;
- }
- list_del(&sev->list);
- }
+ if (sev != NULL)
+ __v4l2_event_unsubscribe(sev);
spin_unlock_irqrestore(&fh->vdev->fh_lock, flags);
From: Thomas Richter <tmricht(a)linux.ibm.com>
On s390 the CPU Measurement Facility for counters now supports
2 PMUs named cpum_cf (CPU Measurement Facility for counters) and
cpum_cf_diag (CPU Measurement Facility for diagnostic counters)
for one and the same CPU.
Running command
[root@s35lp76 perf]# ./perf stat -e tx_c_tend \
-- ~/mytests/cf-tx-events 1
Measuring transactions
TX_C_TABORT_NO_SPECIAL: 0 expected:0
TX_C_TABORT_SPECIAL: 0 expected:0
TX_C_TEND: 1 expected:1
TX_NC_TABORT: 11 expected:11
TX_NC_TEND: 1 expected:1
Performance counter stats for '/root/mytests/cf-tx-events 1':
2 tx_c_tend
0.002120091 seconds time elapsed
0.000121000 seconds user
0.002127000 seconds sys
[root@s35lp76 perf]#
displays output which is unexpected (and wrong):
2 tx_c_tend
The test program definitely triggers only one transaction, as shown
in line 'TX_C_TEND: 1 expected:1'.
This is caused by the following call sequence:
pmu_lookup() scans and installs a PMU.
+--> pmu_aliases() parses all aliases in directory
.../<pmu-name>/events/* which are file names.
+--> pmu_aliases_parse() Read each file in directory and create
a new alias entry. This is done with
+--> perf_pmu__new_alias() and
+--> __perf_pmu__new_alias() which also check for
identical alias names.
After pmu_aliases() returns, a complete list of event names
for this pmu has been created. Now function
pmu_add_cpu_aliases() is called to add the events listed in the json
| files to the alias list of the cpu.
+--> perf_pmu__find_map() Returns a pointer to the json events.
Now function pmu_add_cpu_aliases() scans through all events listed
in the JSON files for this CPU.
Each json event pmu name is compared with the current PMU being
built up and if they match, the json event is added to the
current PMU's alias list.
To avoid duplicate entries the following comparison is done:
if (!is_arm_pmu_core(name)) {
pname = pe->pmu ? pe->pmu : "cpu";
if (strncmp(pname, name, strlen(pname)))
continue;
}
The culprit is the strncmp() function.
Using current s390 PMU naming, the first PMU is 'cpum_cf'
and a long list of events is added, among them 'tx_c_tend'.
When the second PMU named 'cpum_cf_diag' is added, only one event
named 'CF_DIAG' is added by the pmu_aliases() function.
Now function pmu_add_cpu_aliases() is invoked for PMU 'cpum_cf_diag'.
Since the CPUID string is the same for both PMUs, json file events
for PMU named 'cpum_cf' are added to the PMU 'cpum_cf_diag'.
This happens because the strncmp() actually compares:
strncmp("cpum_cf", "cpum_cf_diag", 7);
The first parameter is the pmu name taken from the event in
the json file. The second parameter is the pmu name of the PMU
currently being built.
They are different, but the compare is limited to the length of the
first name, so only the common prefix is tested and strncmp() returns 0
(a match) when the names should be treated as different.
Now all events for PMU cpum_cf are added to the alias list for pmu
cpum_cf_diag.
Later on in function parse_events_add_pmu() the event 'tx_c_tend' is
searched for in all available PMUs and found twice, adding it two
times to the evsel_list global variable which is the root
of all events. This results in a counter value of 2 instead
of 1.
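The prefix problem is easy to reproduce outside of perf. The following is a minimal sketch; pmu_matches_prefix() and pmu_matches_exact() are hypothetical helper names that wrap the old and the new comparison, respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Old check: compares only the first strlen(pname) bytes. */
static bool pmu_matches_prefix(const char *pname, const char *name)
{
	assert(pname && name);
	return strncmp(pname, name, strlen(pname)) == 0;
}

/* New check: the two names must be identical. */
static bool pmu_matches_exact(const char *pname, const char *name)
{
	assert(pname && name);
	return strcmp(pname, name) == 0;
}
```

Called with "cpum_cf" and "cpum_cf_diag", the prefix variant reports a match while the exact variant does not, which is the difference between counting the event twice and counting it once.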
Output with this patch:
[root@s35lp76 perf]# ./perf stat -e tx_c_tend \
-- ~/mytests/cf-tx-events 1
Measuring transactions
TX_C_TABORT_NO_SPECIAL: 0 expected:0
TX_C_TABORT_SPECIAL: 0 expected:0
TX_C_TEND: 1 expected:1
TX_NC_TABORT: 11 expected:11
TX_NC_TEND: 1 expected:1
Performance counter stats for '/root/mytests/cf-tx-events 1':
1 tx_c_tend
0.001815365 seconds time elapsed
0.000123000 seconds user
0.001756000 seconds sys
[root@s35lp76 perf]#
Signed-off-by: Thomas Richter <tmricht(a)linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner(a)linux.ibm.com>
Reviewed-by: Sebastien Boisvert <sboisvert(a)gydle.com>
Cc: Heiko Carstens <heiko.carstens(a)de.ibm.com>
Cc: Kan Liang <kan.liang(a)linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
Cc: stable(a)vger.kernel.org
Fixes: 292c34c10249 ("perf pmu: Fix core PMU alias list for X86 platform")
Link: http://lkml.kernel.org/r/20181023151616.78193-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
---
tools/perf/util/pmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 7799788f662f..7e49baad304d 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -773,7 +773,7 @@ static void pmu_add_cpu_aliases(struct list_head *head, struct perf_pmu *pmu)
if (!is_arm_pmu_core(name)) {
pname = pe->pmu ? pe->pmu : "cpu";
- if (strncmp(pname, name, strlen(pname)))
+ if (strcmp(pname, name))
continue;
}
--
2.14.4
Hi, all,
I found the following problem (attached at the end) when testing stable-4.4 with
Syzkaller. It is not easy to trigger, so the tool did not generate
a reproducer for it.
From the call stack, it is because the first parameter in ktime_sub is large, and
the second parameter offset is a negative number, causing the final result to
overflow into the sign bit and become a large negative number.
--------------
...
ktime_t expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
...
--------------
But I don't know how to fix this problem. The mainline code is also different from
stable-4.4, and I have not found a patch to fix this problem in the mainline
repository.
So I am a bit confused about how to fix it. Can anyone give me some advice?
Thanks.
Xiaojun.
================================================================================
UBSAN: Undefined behaviour in kernel/time/hrtimer.c:615:20
signed integer overflow:
9223372036854775807 - -495588161 cannot be represented in type 'long long int'
CPU: 0 PID: 4542 Comm: syz-executor0 Not tainted 4.4.156-514.55.6.9.x86_64+ #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
1ffff100391dbf45 ad071d3307b76e03 ffff8801c8edfab0 ffffffff81c9f586
0000000041b58ab3 ffffffff831fd4e6 ffffffff81c9f478 ffff8801c8edfad8
ffff8801c8edfa78 00000000000014a9 ad071d3307b76e03 ffffffff837fd660
Call Trace:
[<ffffffff81c9f586>] __dump_stack lib/dump_stack.c:15 [inline]
[<ffffffff81c9f586>] dump_stack+0x10e/0x1a8 lib/dump_stack.c:51
[<ffffffff81d814a6>] ubsan_epilogue+0x12/0x8f lib/ubsan.c:164
[<ffffffff81d830a1>] handle_overflow+0x23e/0x299 lib/ubsan.c:195
[<ffffffff81d83157>] __ubsan_handle_sub_overflow+0x2a/0x31 lib/ubsan.c:211
[<ffffffff813d8c33>] hrtimer_reprogram kernel/time/hrtimer.c:615 [inline]
[<ffffffff813d8c33>] hrtimer_start_range_ns+0x1083/0x1580 kernel/time/hrtimer.c:1024
[<ffffffff813fde1f>] hrtimer_start include/linux/hrtimer.h:393 [inline]
[<ffffffff813fde1f>] alarm_start+0xcf/0x130 kernel/time/alarmtimer.c:328
[<ffffffff813fed66>] alarm_timer_set+0x296/0x4a0 kernel/time/alarmtimer.c:632
[<ffffffff813e1a3e>] SYSC_timer_settime kernel/time/posix-timers.c:914 [inline]
[<ffffffff813e1a3e>] SyS_timer_settime+0x2be/0x3d0 kernel/time/posix-timers.c:885
[<ffffffff82c2fb61>] entry_SYSCALL_64_fastpath+0x1e/0x9e
================================================================================
================================================================================
UBSAN: Undefined behaviour in kernel/time/hrtimer.c:490:13
signed integer overflow:
9223372036854775807 - -495588161 cannot be represented in type 'long long int'
CPU: 0 PID: 4542 Comm: syz-executor0 Not tainted 4.4.156-514.55.6.9.x86_64+ #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
1ffff1003ed40f8b ad071d3307b76e03 ffff8801f6a07ce0 ffffffff81c9f586
0000000041b58ab3 ffffffff831fd4e6 ffffffff81c9f478 ffff8801f6a07d08
ffff8801f6a07ca8 000000000000000a ad071d3307b76e03 ffffffff837fd660
Call Trace:
<IRQ> [<ffffffff81c9f586>] __dump_stack lib/dump_stack.c:15 [inline]
<IRQ> [<ffffffff81c9f586>] dump_stack+0x10e/0x1a8 lib/dump_stack.c:51
[<ffffffff81d814a6>] ubsan_epilogue+0x12/0x8f lib/ubsan.c:164
[<ffffffff81d830a1>] handle_overflow+0x23e/0x299 lib/ubsan.c:195
[<ffffffff81d83157>] __ubsan_handle_sub_overflow+0x2a/0x31 lib/ubsan.c:211
[<ffffffff813d43ea>] __hrtimer_get_next_event+0x1da/0x2b0 kernel/time/hrtimer.c:490
[<ffffffff813d9532>] hrtimer_interrupt+0x202/0x580 kernel/time/hrtimer.c:1361
[<ffffffff8113e7ad>] local_apic_timer_interrupt+0x9d/0x150 arch/x86/kernel/apic/apic.c:901
[<ffffffff82c32ea0>] smp_apic_timer_interrupt+0x80/0xb0 arch/x86/kernel/apic/apic.c:925
[<ffffffff82c30ac5>] apic_timer_interrupt+0xa5/0xb0 arch/x86/entry/entry_64.S:563
<EOI> [<ffffffff82c2f0fb>] ? arch_local_irq_restore arch/x86/include/asm/paravirt.h:812 [inline]
<EOI> [<ffffffff82c2f0fb>] ? __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:162 [inline]
<EOI> [<ffffffff82c2f0fb>] ? _raw_spin_unlock_irqrestore+0x3b/0x60 kernel/locking/spinlock.c:191
[<ffffffff813e1a4f>] unlock_timer include/linux/spinlock.h:362 [inline]
[<ffffffff813e1a4f>] SYSC_timer_settime kernel/time/posix-timers.c:916 [inline]
[<ffffffff813e1a4f>] SyS_timer_settime+0x2cf/0x3d0 kernel/time/posix-timers.c:885
[<ffffffff82c2fb61>] entry_SYSCALL_64_fastpath+0x1e/0x9e
================================================================================
From: Michal Hocko <mhocko(a)suse.com>
Baoquan He has noticed that 15c30bc09085 ("mm, memory_hotplug: make
has_unmovable_pages more robust") is causing memory offlining failures
on a movable node. After further debugging it turned out that
has_unmovable_pages fails prematurely because it stumbles over off-LRU
pages. Those pages are not on the LRU only because they are waiting
on the pcp LRU caches (an example of __dump_page added by a debugging
patch):
[ 560.923297] page:ffffea043f39fa80 count:1 mapcount:0 mapping:ffff880e5dce1b59 index:0x7f6eec459
[ 560.931967] flags: 0x5fffffc0080024(uptodate|active|swapbacked)
[ 560.937867] raw: 005fffffc0080024 dead000000000100 dead000000000200 ffff880e5dce1b59
[ 560.945606] raw: 00000007f6eec459 0000000000000000 00000001ffffffff ffff880e43ae8000
[ 560.953323] page dumped because: hotplug
[ 560.957238] page->mem_cgroup:ffff880e43ae8000
[ 560.961620] has_unmovable_pages: pfn:0x10fd030d, found:0x1, count:0x0
[ 560.968127] page:ffffea043f40c340 count:2 mapcount:0 mapping:ffff880e2f2d8628 index:0x0
[ 560.976104] flags: 0x5fffffc0000006(referenced|uptodate)
[ 560.981401] raw: 005fffffc0000006 dead000000000100 dead000000000200 ffff880e2f2d8628
[ 560.989119] raw: 0000000000000000 0000000000000000 00000002ffffffff ffff88010a8f5000
[ 560.996833] page dumped because: hotplug
The issue could be worked around by calling lru_add_drain_all but we can
do better than that. We know that all swap backed pages are migratable
and the same applies to pages which implement the migratepage
callback.
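The resulting per-page decision can be sketched as a small predicate. This is a simplified model, not the kernel code: struct page_model and counts_as_unmovable() are hypothetical, and each field merely stands for the kernel test named in its comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified page description; the fields are illustrative only. */
struct page_model {
	bool movable;         /* models __PageMovable() */
	bool on_lru;          /* models PageLRU() */
	bool swap_backed;     /* models PageSwapBacked() */
	bool has_migratepage; /* models a non-NULL a_ops->migratepage */
};

/* True when the page should count towards "found". */
static bool counts_as_unmovable(const struct page_model *p)
{
	assert(p != NULL);
	if (p->movable || p->on_lru)
		return false;
	if (p->swap_backed)
		return false; /* temporarily off-LRU, still migratable */
	if (p->has_migratepage)
		return false; /* the mapping knows how to migrate it */
	return true;
}
```

An off-LRU page that is swap backed, like the first page in the dump above, is no longer counted, whereas an off-LRU page with none of these properties still is.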
Reported-by: Baoquan He <bhe(a)redhat.com>
Fixes: 15c30bc09085 ("mm, memory_hotplug: make has_unmovable_pages more robust")
Cc: stable
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
---
Hi,
we have been discussing the issue reported by Baoquan [1] mostly off-list
and he has confirmed that the patch solves the failures he is seeing. I believe
that has_unmovable_pages begs for a much better implementation and/or
a substantial rethinking of the page isolation design, but let's first close
the bug, which can be really annoying.
[1] http://lkml.kernel.org/r/20181101091055.GA15166@MiWiFi-R3L-srv
mm/page_alloc.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 863d46da6586..48ceda313332 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7824,8 +7824,22 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
if (__PageMovable(page))
continue;
- if (!PageLRU(page))
- found++;
+ if (PageLRU(page))
+ continue;
+
+ /*
+ * Some LRU pages might be temporarily off-LRU for all
+ * sort of different reasons - reclaim, migration,
+ * per-cpu LRU caches etc.
+ * Make sure we do not consider those pages to be unmovable.
+ */
+ if (PageSwapBacked(page))
+ continue;
+
+ if (page->mapping && page->mapping->a_ops &&
+ page->mapping->a_ops->migratepage)
+ continue;
+
/*
* If there are RECLAIMABLE pages, we need to check
* it. But now, memory offline itself doesn't call
@@ -7839,7 +7853,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
* is set to both of a memory hole page and a _used_ kernel
* page at boot.
*/
- if (found > count)
+ if (++found > count)
goto unmovable;
}
return false;
--
2.19.1
In the OpenWrt Project, a flash write error has been seen on some products.
The issue can be fixed by using chip_good() instead of chip_ready().
chip_ready() just reads the value from flash memory twice and compares the
two reads, while chip_good() also checks the value against the expected value.
The issue is probably fixed because chip_good() verifies the written data correctly.
So change to use chip_good() instead of chip_ready().
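The difference between the two checks can be sketched in userspace. This is a model under stated assumptions, not the driver code: read_fn, model_chip_ready(), model_chip_good() and read_stale() are hypothetical, and a 16-bit read hook stands in for the real map accessors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical read hook standing in for a flash read at one address. */
typedef uint16_t (*read_fn)(void *ctx);

/* Ready: two consecutive reads return the same value. */
static bool model_chip_ready(read_fn rd, void *ctx)
{
	assert(rd != NULL);
	uint16_t a = rd(ctx);
	uint16_t b = rd(ctx);
	return a == b;
}

/* Good: two reads agree and also match the datum that was written. */
static bool model_chip_good(read_fn rd, void *ctx, uint16_t expected)
{
	assert(rd != NULL);
	uint16_t a = rd(ctx);
	uint16_t b = rd(ctx);
	return a == b && b == expected;
}

/* Fake device that returns a stable but wrong value. */
static uint16_t read_stale(void *ctx)
{
	(void)ctx;
	return 0xFFFF;
}
```

A device that returns the same wrong value twice passes the "ready" check but fails the "good" check, which is the situation the commit message suspects.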
Signed-off-by: Tokunori Ikegami <ikegami(a)allied-telesis.co.jp>
Signed-off-by: Hauke Mehrtens <hauke(a)hauke-m.de>
Signed-off-by: Koen Vandeputte <koen.vandeputte(a)ncentric.com>
Signed-off-by: Fabio Bettoni <fbettoni(a)gmail.com>
Co-Developed-by: Hauke Mehrtens <hauke(a)hauke-m.de>
Co-Developed-by: Koen Vandeputte <koen.vandeputte(a)ncentric.com>
Co-Developed-by: Fabio Bettoni <fbettoni(a)gmail.com>
Reported-by: Fabio Bettoni <fbettoni(a)gmail.com>
Cc: Chris Packham <chris.packham(a)alliedtelesis.co.nz>
Cc: Joakim Tjernlund <Joakim.Tjernlund(a)infinera.com>
Cc: Boris Brezillon <boris.brezillon(a)free-electrons.com>
Cc: linux-mtd(a)lists.infradead.org
Cc: stable(a)vger.kernel.org
---
Changes since v2:
- Just update the commit message for the comment.
Changes since v1:
- Just update the commit message.
Background:
This is required for the OpenWrt Project to resolve the flash write issue, as
in the patch below.
<https://git.openwrt.org/?p=openwrt/openwrt.git;a=commitdiff;h=ddc11c3932c7b…>
The original patch in OpenWrt is below.
<https://github.com/openwrt/openwrt/blob/v18.06.0/target/linux/ar71xx/patche…>
The reason to use chip_good() is simply that it actually fixes the issue.
In the past I also fixed the erase function in the same way with the
patch below.
<https://patchwork.ozlabs.org/patch/922656/>
Note: The reason for the erase patch is the same.
In my understanding, chip_ready() just reads the value from flash twice
and compares the two reads.
So I think that sometimes an incorrect value is read twice, depending
on the flash device behavior, but I am not sure.
So change to use chip_good() instead of chip_ready().
drivers/mtd/chips/cfi_cmdset_0002.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c b/drivers/mtd/chips/cfi_cmdset_0002.c
index 72428b6bfc47..251c9e1675bd 100644
--- a/drivers/mtd/chips/cfi_cmdset_0002.c
+++ b/drivers/mtd/chips/cfi_cmdset_0002.c
@@ -1627,31 +1627,37 @@ static int __xipram do_write_oneword(struct map_info *map, struct flchip *chip,
continue;
}
- if (time_after(jiffies, timeo) && !chip_ready(map, adr)){
+ if (chip_good(map, adr, datum))
+ break;
+
+ if (time_after(jiffies, timeo)){
xip_enable(map, chip, adr);
printk(KERN_WARNING "MTD %s(): software timeout\n", __func__);
xip_disable(map, chip, adr);
+ ret = -EIO;
break;
}
- if (chip_ready(map, adr))
- break;
-
/* Latency issues. Drop the lock, wait a while and retry */
UDELAY(map, chip, adr, 1);
}
+
/* Did we succeed? */
- if (!chip_good(map, adr, datum)) {
+ if (ret) {
/* reset on all failures. */
map_write(map, CMD(0xF0), chip->start);
/* FIXME - should have reset delay before continuing */
- if (++retry_cnt <= MAX_RETRIES)
+ if (++retry_cnt <= MAX_RETRIES) {
+ ret = 0;
goto retry;
+ }
ret = -EIO;
}
+
xip_enable(map, chip, adr);
+
op_done:
if (mode == FL_OTP_WRITE)
otp_exit(map, chip, adr, map_bankwidth(map));
--
2.18.0