The patch titled
Subject: ocfs2: free up write context when direct IO failed
has been added to the -mm tree. Its filename is
ocfs2-free-up-write-context-when-direct-io-failed.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/ocfs2-free-up-write-context-when-d…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/ocfs2-free-up-write-context-when-d…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Wengang Wang <wen.gang.wang(a)oracle.com>
Subject: ocfs2: free up write context when direct IO failed
The write context should also be freed even when direct IO failed.
Otherwise a memory leak is introduced and entries remain in
oi->ip_unwritten_list causing the following BUG later in unlink path:
ERROR: bug expression: !list_empty(&oi->ip_unwritten_list)
ERROR: Clear inode of 215043, inode has unwritten extents
...
Call Trace:
? __set_current_blocked+0x42/0x68
ocfs2_evict_inode+0x91/0x6a0 [ocfs2]
? bit_waitqueue+0x40/0x33
evict+0xdb/0x1af
iput+0x1a2/0x1f7
do_unlinkat+0x194/0x28f
SyS_unlinkat+0x1b/0x2f
do_syscall_64+0x79/0x1ae
entry_SYSCALL_64_after_hwframe+0x151/0x0
This patch also logs, with frequency limit, direct IO failures.
Link: http://lkml.kernel.org/r/20181102170632.25921-1-wen.gang.wang@oracle.com
Signed-off-by: Wengang Wang <wen.gang.wang(a)oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi(a)oracle.com>
Reviewed-by: Changwei Ge <ge.changwei(a)h3c.com>
Reviewed-by: Joseph Qi <jiangqi903(a)gmail.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/aops.c | 12 ++++++++++--
fs/ocfs2/cluster/masklog.h | 9 +++++++++
2 files changed, 19 insertions(+), 2 deletions(-)
--- a/fs/ocfs2/aops.c~ocfs2-free-up-write-context-when-direct-io-failed
+++ a/fs/ocfs2/aops.c
@@ -2411,8 +2411,16 @@ static int ocfs2_dio_end_io(struct kiocb
/* this io's submitter should not have unlocked this before we could */
BUG_ON(!ocfs2_iocb_is_rw_locked(iocb));
- if (bytes > 0 && private)
- ret = ocfs2_dio_end_io_write(inode, private, offset, bytes);
+ if (bytes <= 0)
+ mlog_ratelimited(ML_ERROR, "Direct IO failed, bytes = %lld",
+ (long long)bytes);
+ if (private) {
+ if (bytes > 0)
+ ret = ocfs2_dio_end_io_write(inode, private, offset,
+ bytes);
+ else
+ ocfs2_dio_free_write_ctx(inode, private);
+ }
ocfs2_iocb_clear_rw_locked(iocb);
--- a/fs/ocfs2/cluster/masklog.h~ocfs2-free-up-write-context-when-direct-io-failed
+++ a/fs/ocfs2/cluster/masklog.h
@@ -178,6 +178,15 @@ do { \
##__VA_ARGS__); \
} while (0)
+#define mlog_ratelimited(mask, fmt, ...) \
+do { \
+ static DEFINE_RATELIMIT_STATE(_rs, \
+ DEFAULT_RATELIMIT_INTERVAL, \
+ DEFAULT_RATELIMIT_BURST); \
+ if (__ratelimit(&_rs)) \
+ mlog(mask, fmt, ##__VA_ARGS__); \
+} while (0)
+
#define mlog_errno(st) ({ \
int _st = (st); \
if (_st != -ERESTARTSYS && _st != -EINTR && \
_
Patches currently in -mm which might be from wen.gang.wang(a)oracle.com are
ocfs2-free-up-write-context-when-direct-io-failed.patch
Commit '88ba95bedb79 ("backlight: pwm_bl: Compute brightness of LED
linearly to human eye")' allows the possibility to compute a default
brightness table when there isn't the brightness-levels property in the
DT. Unfortunately the changes made broke the pwm backlight for the
non-DT boards.
Usually, the non-DT boards don't pass the brightness levels via platform
data, instead, it sets the max_brightness in their platform data and the
driver calculates the level without a table. The ofending patch assumed
hat when there is no brightness levels table we should create one, but this
is clearly wrong for the non-DT case.
After this patch the code handles the DT and the non-DT case taking in
consideration also if max_brightness is set or not.
Fixes: '88ba95bedb79 ("backlight: pwm_bl: Compute brightness of LED linearly to human eye")'
Cc: <stable(a)vger.kernel.org>
Reported-by: Robert Jarzmik <robert.jarzmik(a)free.fr>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo(a)collabora.com>
Tested-by: Robert Jarzmik <robert.jarzmik(a)free.fr>
Acked-by: Daniel Thompson <daniel.thompson(a)linaro.org>
---
Changes in v2:
- Rebase on top of mainline
- Add Tested-by and Acked-by tags.
drivers/video/backlight/pwm_bl.c | 41 +++++++++++++++++++++++++++-----
1 file changed, 35 insertions(+), 6 deletions(-)
diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c
index 678b27063198..f9ef0673a083 100644
--- a/drivers/video/backlight/pwm_bl.c
+++ b/drivers/video/backlight/pwm_bl.c
@@ -562,7 +562,30 @@ static int pwm_backlight_probe(struct platform_device *pdev)
goto err_alloc;
}
- if (!data->levels) {
+ if (data->levels) {
+ /*
+ * For the DT case, only when brightness levels is defined
+ * data->levels is filled. For the non-DT case, data->levels
+ * can come from platform data, however is not usual.
+ */
+ for (i = 0; i <= data->max_brightness; i++) {
+ if (data->levels[i] > pb->scale)
+ pb->scale = data->levels[i];
+
+ pb->levels = data->levels;
+ }
+ } else if (!data->max_brightness) {
+ /*
+ * If no brightness levels are provided and max_brightness is
+ * not set, use the default brightness table. For the DT case,
+ * max_brightness is set to 0 when brightness levels is not
+ * specified. For the non-DT case, max_brightness is usually
+ * set to some value.
+ */
+
+ /* Get the PWM period (in nanoseconds) */
+ pwm_get_state(pb->pwm, &state);
+
ret = pwm_backlight_brightness_default(&pdev->dev, data,
state.period);
if (ret < 0) {
@@ -570,13 +593,19 @@ static int pwm_backlight_probe(struct platform_device *pdev)
"failed to setup default brightness table\n");
goto err_alloc;
}
- }
- for (i = 0; i <= data->max_brightness; i++) {
- if (data->levels[i] > pb->scale)
- pb->scale = data->levels[i];
+ for (i = 0; i <= data->max_brightness; i++) {
+ if (data->levels[i] > pb->scale)
+ pb->scale = data->levels[i];
- pb->levels = data->levels;
+ pb->levels = data->levels;
+ }
+ } else {
+ /*
+ * That only happens for the non-DT case, where platform data
+ * sets the max_brightness value.
+ */
+ pb->scale = data->max_brightness;
}
pb->lth_brightness = data->lth_brightness * (state.period / pb->scale);
--
2.19.1
Hello,
on an mxs machine running a v4.9-rt kernel I saw the following
backtrace:
[ 25.309794] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:987
[ 25.309810] in_atomic(): 1, irqs_disabled(): 128, pid: 221, name: mrp62439d
[ 25.309834] CPU: 0 PID: 221 Comm: mrp62439d Not tainted 4.9.109-20180709-1-rt76-00043-g80a57dee2c6b-dirty #43
[ 25.309840] Hardware name: Freescale MXS (Device Tree)
[ 25.309913] [<c000fad8>] (unwind_backtrace) from [<c000df34>] (show_stack+0x18/0x1c)
[ 25.309957] [<c000df34>] (show_stack) from [<c0042924>] (___might_sleep+0xf8/0x164)
[ 25.309996] [<c0042924>] (___might_sleep) from [<c04ec920>] (rt_spin_lock+0x20/0x6c)
[ 25.310039] [<c04ec920>] (rt_spin_lock) from [<c02e5a20>] (gpio_to_desc+0x1c/0xbc)
[ 25.310076] [<c02e5a20>] (gpio_to_desc) from [<c02ec358>] (mxs_gpio_set_irq_type+0x114/0x194)
[ 25.310107] [<c02ec358>] (mxs_gpio_set_irq_type) from [<c00558d8>] (__irq_set_trigger+0x64/0x158)
[ 25.310132] [<c00558d8>] (__irq_set_trigger) from [<c0055f74>] (__setup_irq+0x5a8/0x694)
[ 25.310156] [<c0055f74>] (__setup_irq) from [<c0056408>] (request_threaded_irq+0xe4/0x18c)
[ 25.310187] [<c0056408>] (request_threaded_irq) from [<c02e8650>] (gpio_ioctl+0x304/0x680)
[ 25.310217] [<c02e8650>] (gpio_ioctl) from [<c01476a4>] (do_vfs_ioctl+0x98/0x970)
[ 25.310241] [<c01476a4>] (do_vfs_ioctl) from [<c0147fb8>] (SyS_ioctl+0x3c/0x64)
[ 25.310264] [<c0147fb8>] (SyS_ioctl) from [<c000a4d8>] (__sys_trace_return+0x0/0x10)
The problem is that __irq_set_trigger is called in atomic context (even on rt)
and mxs_gpio_set_irq_type calls gpio_get_value which takes the
(sleeping on rt) gpio_lock.
I didn't test 833eacc7b5913da9896bacd30db7d490aa777868 yet (I used
val = readl(port->base + PINCTRL_DIN(port)) & pin_mask;
instead) but this commit should have the same effect.
Can you please backport this? (There are maybe some more similar
problems that were introduced in v4.5-rc1 / 0f4630f3720e7e, I didn't
check these though.)
Best regards
Uwe
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |
From: Huacai Chen <chenhc(a)lemote.com>
[ Upstream commit c61c7def1fa0a722610d89790e0255b74f3c07dd ]
Commit ea7e0480a4b6 ("MIPS: VDSO: Always map near top of user memory")
set VDSO_RANDOMIZE_SIZE to 256MB for 64bit kernel. But take a look at
arch/mips/mm/mmap.c we can see that MIN_GAP is 128MB, which means the
mmap_base may be at (user_address_top - 128MB). This make the stack be
surrounded by mmaped areas, then stack expanding fails and causes a
segmentation fault. Therefore, VDSO_RANDOMIZE_SIZE should be less than
MIN_GAP and this patch reduce it to 64MB.
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
Signed-off-by: Paul Burton <paul.burton(a)mips.com>
Fixes: ea7e0480a4b6 ("MIPS: VDSO: Always map near top of user memory")
Patchwork: https://patchwork.linux-mips.org/patch/20910/
Cc: Ralf Baechle <ralf(a)linux-mips.org>
Cc: James Hogan <jhogan(a)kernel.org>
Cc: linux-mips(a)linux-mips.org
Cc: Fuxin Zhang <zhangfx(a)lemote.com>
Cc: Zhangjin Wu <wuzhangjin(a)gmail.com>
Cc: Huacai Chen <chenhuacai(a)gmail.com>
Cc: stable(a)vger.kernel.org # 4.19
---
arch/mips/include/asm/processor.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
index 49d6046ca1d0..c373eb605040 100644
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -81,7 +81,7 @@ extern unsigned int vced_count, vcei_count;
#endif
-#define VDSO_RANDOMIZE_SIZE (TASK_IS_32BIT_ADDR ? SZ_1M : SZ_256M)
+#define VDSO_RANDOMIZE_SIZE (TASK_IS_32BIT_ADDR ? SZ_1M : SZ_64M)
extern unsigned long mips_stack_top(void);
#define STACK_TOP mips_stack_top()
--
2.19.1
On Tue, Nov 6, 2018 at 3:06 PM Petr Vorel <pvorel(a)suse.cz> wrote:
>
> Hi Amir,
>
> > There must be some confusion.
> > FAN_MARK_MOUNT was NOT added in v4.19-rc2.
> > It has been there from the start.
> > FAN_MARK_INODE was NOT added either
> > the define FAN_MARK_INODE 0 is just a convenience readability define
> > it does not change the API.
> I'm sorry, you're right.
>
> > You may be confusing with FAN_MARK_FILESYSTEM
> > just was just added in kernel v4.20-rc1.
> > The extension of tests to cover FAN_MARK_FILESYSTEM
> > is waiting in my queue:
> > https://github.com/amir73il/ltp/commits/fanotify_sb
>
> > And it already includes runtime checks for FAN_MARK_FILESYSTEM
> > support.
>
> > Did I miss anything?
> Testing your branch on older kernel, fanotify10 fails earlier than new TCONF
Because fanotify10 checks for a bug that existed since the beginning
and fixed by 9bdda4e9cf2d fsnotify: fix ignore mask logic in fsnotify().
So test SHOULD fail until the backported patch is applied to the old kernel.
The patch does not apply cleanly to kernels <= v4.17.
Tested backport patch for v4.14.y attached.
Thanks,
Amir.
When the platform BIOS is unable to report all the media error records
it requires the OS to restart the scrub at a prescribed location. The
driver detects the overflow condition, but then fails to report it to
the ARS state machine after reaping the records. Propagate -ENOSPC
correctly to continue the ARS operation.
Cc: <stable(a)vger.kernel.org>
Fixes: 1cf03c00e7c1 ("nfit: scrub and register regions in a workqueue")
Reported-by: Jacek Zloch <jacek.zloch(a)intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
drivers/acpi/nfit/core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index f8c638f3c946..5970b8f5f768 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -2928,9 +2928,9 @@ static int acpi_nfit_query_poison(struct acpi_nfit_desc *acpi_desc)
return rc;
if (ars_status_process_records(acpi_desc))
- return -ENOMEM;
+ dev_err(acpi_desc->dev, "Failed to process ARS records\n");
- return 0;
+ return rc;
}
static int ars_register(struct acpi_nfit_desc *acpi_desc,
From: Masahisa Kojima <masahisa.kojima(a)linaro.org>
[ Upstream commit 8d5b0bf611ec5b7618d5b772dddc93b8afa78cb8 ]
We observed that packets and bytes count are not reset
when user performs interface down. Eventually, tx queue is
exhausted and packets will not be sent out.
To avoid this problem, resets tx queue in ndo_stop.
Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver")
Signed-off-by: Masahisa Kojima <masahisa.kojima(a)linaro.org>
Signed-off-by: Yoshitoyo Osaki <osaki.yoshitoyo(a)socionext.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/net/ethernet/socionext/netsec.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/socionext/netsec.c b/drivers/net/ethernet/socionext/netsec.c
index 4289ccb26e4e..d2caeb9edc04 100644
--- a/drivers/net/ethernet/socionext/netsec.c
+++ b/drivers/net/ethernet/socionext/netsec.c
@@ -940,6 +940,9 @@ static void netsec_uninit_pkt_dring(struct netsec_priv *priv, int id)
dring->head = 0;
dring->tail = 0;
dring->pkt_cnt = 0;
+
+ if (id == NETSEC_RING_TX)
+ netdev_reset_queue(priv->ndev);
}
static void netsec_free_dring(struct netsec_priv *priv, int id)
--
2.17.1
From: Xunlei Pang <xlpang(a)redhat.com>
On some of our systems, we notice this error popping up on occasion,
completely hanging the system.
[<ffffffc0000ee398>] enqueue_task_dl+0x1f0/0x420
[<ffffffc0000d0f14>] activate_task+0x7c/0x90
[<ffffffc0000edbdc>] push_dl_task+0x164/0x1c8
[<ffffffc0000edc60>] push_dl_tasks+0x20/0x30
[<ffffffc0000cc00c>] __balance_callback+0x44/0x68
[<ffffffc000d2c018>] __schedule+0x6f0/0x728
[<ffffffc000d2c278>] schedule+0x78/0x98
[<ffffffc000d2e76c>] __rt_mutex_slowlock+0x9c/0x108
[<ffffffc000d2e9d0>] rt_mutex_slowlock+0xd8/0x198
[<ffffffc0000f7f28>] rt_mutex_timed_futex_lock+0x30/0x40
[<ffffffc00012c1a8>] futex_lock_pi+0x200/0x3b0
[<ffffffc00012cf84>] do_futex+0x1c4/0x550
It runs an 4.4 kernel on an arm64 rig. The signature looks suspciously
similar to what Xuneli Pang observed in his crash, and with this fix, my
issue goes away (my system has survivied approx 1500 reboots and a few
nasty tests so far)
Alongside this patch in the tree, there are a few other bits and pieces
pertaining to futex, rtmutex and kernel/sched/, but those patches
creates
weird crashes that I have not been able to dissect yet. Once (if) I have
been able to figure those out (and test), they will be sent later.
I am sure other users of LTS that also use sched_deadline will run into
this issue, so I think it is a good candidate for 4.4-stable. Possibly
also
to 4.9 and 4.14, but I have not had time to test for those versions.
Apart from a minor conflict in sched.h, the patch applied cleanly.
(Tested on arm64 running 4.4.<late-ish>)
-Henrik
A crash happened while I was playing with deadline PI rtmutex.
BUG: unable to handle kernel NULL pointer dereference at
0000000000000018
IP: [<ffffffff810eeb8f>] rt_mutex_get_top_task+0x1f/0x30
PGD 232a75067 PUD 230947067 PMD 0
Oops: 0000 [#1] SMP
CPU: 1 PID: 10994 Comm: a.out Not tainted
Call Trace:
[<ffffffff810b658c>] enqueue_task+0x2c/0x80
[<ffffffff810ba763>] activate_task+0x23/0x30
[<ffffffff810d0ab5>] pull_dl_task+0x1d5/0x260
[<ffffffff810d0be6>] pre_schedule_dl+0x16/0x20
[<ffffffff8164e783>] __schedule+0xd3/0x900
[<ffffffff8164efd9>] schedule+0x29/0x70
[<ffffffff8165035b>] __rt_mutex_slowlock+0x4b/0xc0
[<ffffffff81650501>] rt_mutex_slowlock+0xd1/0x190
[<ffffffff810eeb33>] rt_mutex_timed_lock+0x53/0x60
[<ffffffff810ecbfc>] futex_lock_pi.isra.18+0x28c/0x390
[<ffffffff810ed8b0>] do_futex+0x190/0x5b0
[<ffffffff810edd50>] SyS_futex+0x80/0x180
This is because rt_mutex_enqueue_pi() and rt_mutex_dequeue_pi()
are only protected by pi_lock when operating pi waiters, while
rt_mutex_get_top_task(), will access them with rq lock held but
not holding pi_lock.
In order to tackle it, we introduce new "pi_top_task" pointer
cached in task_struct, and add new rt_mutex_update_top_task()
to update its value, it can be called by rt_mutex_setprio()
which held both owner's pi_lock and rq lock. Thus "pi_top_task"
can be safely accessed by enqueue_task_dl() under rq lock.
Originally-From: Peter Zijlstra <peterz(a)infradead.org>
Signed-off-by: Xunlei Pang <xlpang(a)redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Acked-by: Steven Rostedt <rostedt(a)goodmis.org>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: juri.lelli(a)arm.com
Cc: bigeasy(a)linutronix.de
Cc: mathieu.desnoyers(a)efficios.com
Cc: jdesfossez(a)efficios.com
Cc: bristot(a)redhat.com
Link: http://lkml.kernel.org/r/20170323150216.157682758@infradead.org
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
(cherry picked from commit e96a7705e7d3fef96aec9b590c63b2f6f7d2ba22)
Conflicts:
include/linux/sched.h
Backported-and-tested-by: Henrik Austad <haustad(a)cisco.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/init_task.h | 1 +
include/linux/sched.h | 2 ++
include/linux/sched/rt.h | 1 +
kernel/fork.c | 1 +
kernel/locking/rtmutex.c | 29 +++++++++++++++++++++--------
kernel/sched/core.c | 2 ++
6 files changed, 28 insertions(+), 8 deletions(-)
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 1c1ff7e4faa4..a561ce0c5d7f 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -162,6 +162,7 @@ extern struct task_group root_task_group;
#ifdef CONFIG_RT_MUTEXES
# define INIT_RT_MUTEXES(tsk) \
.pi_waiters = RB_ROOT, \
+ .pi_top_task = NULL, \
.pi_waiters_leftmost = NULL,
#else
# define INIT_RT_MUTEXES(tsk)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index a464ba71a993..19a3f946caf0 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1628,6 +1628,8 @@ struct task_struct {
/* PI waiters blocked on a rt_mutex held by this task */
struct rb_root pi_waiters;
struct rb_node *pi_waiters_leftmost;
+ /* Updated under owner's pi_lock and rq lock */
+ struct task_struct *pi_top_task;
/* Deadlock detection and priority inheritance handling */
struct rt_mutex_waiter *pi_blocked_on;
#endif
diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
index a30b172df6e1..60d0c4740b9f 100644
--- a/include/linux/sched/rt.h
+++ b/include/linux/sched/rt.h
@@ -19,6 +19,7 @@ static inline int rt_task(struct task_struct *p)
extern int rt_mutex_getprio(struct task_struct *p);
extern void rt_mutex_setprio(struct task_struct *p, int prio);
extern int rt_mutex_get_effective_prio(struct task_struct *task, int newprio);
+extern void rt_mutex_update_top_task(struct task_struct *p);
extern struct task_struct *rt_mutex_get_top_task(struct task_struct *task);
extern void rt_mutex_adjust_pi(struct task_struct *p);
static inline bool tsk_is_pi_blocked(struct task_struct *tsk)
diff --git a/kernel/fork.c b/kernel/fork.c
index ac00f14208b7..1e5d15defe25 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1240,6 +1240,7 @@ static void rt_mutex_init_task(struct task_struct *p)
#ifdef CONFIG_RT_MUTEXES
p->pi_waiters = RB_ROOT;
p->pi_waiters_leftmost = NULL;
+ p->pi_top_task = NULL;
p->pi_blocked_on = NULL;
#endif
}
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index b066724d7a5b..8ddce9fb6724 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -319,6 +319,19 @@ rt_mutex_dequeue_pi(struct task_struct *task, struct rt_mutex_waiter *waiter)
}
/*
+ * Must hold both p->pi_lock and task_rq(p)->lock.
+ */
+void rt_mutex_update_top_task(struct task_struct *p)
+{
+ if (!task_has_pi_waiters(p)) {
+ p->pi_top_task = NULL;
+ return;
+ }
+
+ p->pi_top_task = task_top_pi_waiter(p)->task;
+}
+
+/*
* Calculate task priority from the waiter tree priority
*
* Return task->normal_prio when the waiter tree is empty or when
@@ -333,12 +346,12 @@ int rt_mutex_getprio(struct task_struct *task)
task->normal_prio);
}
+/*
+ * Must hold either p->pi_lock or task_rq(p)->lock.
+ */
struct task_struct *rt_mutex_get_top_task(struct task_struct *task)
{
- if (likely(!task_has_pi_waiters(task)))
- return NULL;
-
- return task_top_pi_waiter(task)->task;
+ return task->pi_top_task;
}
/*
@@ -347,12 +360,12 @@ struct task_struct *rt_mutex_get_top_task(struct task_struct *task)
*/
int rt_mutex_get_effective_prio(struct task_struct *task, int newprio)
{
- if (!task_has_pi_waiters(task))
+ struct task_struct *top_task = rt_mutex_get_top_task(task);
+
+ if (!top_task)
return newprio;
- if (task_top_pi_waiter(task)->task->prio <= newprio)
- return task_top_pi_waiter(task)->task->prio;
- return newprio;
+ return min(top_task->prio, newprio);
}
/*
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4d1511ad3753..4e23579cc38f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3428,6 +3428,8 @@ void rt_mutex_setprio(struct task_struct *p, int prio)
goto out_unlock;
}
+ rt_mutex_update_top_task(p);
+
trace_sched_pi_setprio(p, prio);
oldprio = p->prio;
prev_class = p->sched_class;
--
2.11.0
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
We deinit the lpe audio device before we call
drm_atomic_helper_shutdown(), which means the platform device
may already be gone when it comes time to shut down the crtc.
As we don't know when the last reference to the platform
device gets dropped by the audio driver we can't assume that
the device and its data are still around when turning off the
crtc. Mark the platform device as gone as soon as we do the
audio deinit.
Cc: stable(a)vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/i915/intel_lpe_audio.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/intel_lpe_audio.c b/drivers/gpu/drm/i915/intel_lpe_audio.c
index cdf19553ffac..5d5336fbe7b0 100644
--- a/drivers/gpu/drm/i915/intel_lpe_audio.c
+++ b/drivers/gpu/drm/i915/intel_lpe_audio.c
@@ -297,8 +297,10 @@ void intel_lpe_audio_teardown(struct drm_i915_private *dev_priv)
lpe_audio_platdev_destroy(dev_priv);
irq_free_desc(dev_priv->lpe_audio.irq);
-}
+ dev_priv->lpe_audio.irq = -1;
+ dev_priv->lpe_audio.platdev = NULL;
+}
/**
* intel_lpe_audio_notify() - notify lpe audio event
--
2.18.1
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: v4l: event: Add subscription to list before calling "add" operation
Author: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Date: Mon Nov 5 09:35:44 2018 -0500
Patch ad608fbcf166 changed how events were subscribed to address an issue
elsewhere. As a side effect of that change, the "add" callback was called
before the event subscription was added to the list of subscribed events,
causing the first event queued by the add callback (and possibly other
events arriving soon afterwards) to be lost.
Fix this by adding the subscription to the list before calling the "add"
callback, and clean up afterwards if that fails.
Fixes: ad608fbcf166 ("media: v4l: event: Prevent freeing event subscriptions while accessed")
Reported-by: Dave Stevenson <dave.stevenson(a)raspberrypi.org>
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Tested-by: Dave Stevenson <dave.stevenson(a)raspberrypi.org>
Reviewed-by: Hans Verkuil <hans.verkuil(a)cisco.com>
Tested-by: Hans Verkuil <hans.verkuil(a)cisco.com>
Cc: stable(a)vger.kernel.org (for 4.14 and up)
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung(a)kernel.org>
drivers/media/v4l2-core/v4l2-event.c | 43 ++++++++++++++++++++----------------
1 file changed, 24 insertions(+), 19 deletions(-)
---
diff --git a/drivers/media/v4l2-core/v4l2-event.c b/drivers/media/v4l2-core/v4l2-event.c
index a3ef1f50a4b3..481e3c65cf97 100644
--- a/drivers/media/v4l2-core/v4l2-event.c
+++ b/drivers/media/v4l2-core/v4l2-event.c
@@ -193,6 +193,22 @@ int v4l2_event_pending(struct v4l2_fh *fh)
}
EXPORT_SYMBOL_GPL(v4l2_event_pending);
+static void __v4l2_event_unsubscribe(struct v4l2_subscribed_event *sev)
+{
+ struct v4l2_fh *fh = sev->fh;
+ unsigned int i;
+
+ lockdep_assert_held(&fh->subscribe_lock);
+ assert_spin_locked(&fh->vdev->fh_lock);
+
+ /* Remove any pending events for this subscription */
+ for (i = 0; i < sev->in_use; i++) {
+ list_del(&sev->events[sev_pos(sev, i)].list);
+ fh->navailable--;
+ }
+ list_del(&sev->list);
+}
+
int v4l2_event_subscribe(struct v4l2_fh *fh,
const struct v4l2_event_subscription *sub, unsigned elems,
const struct v4l2_subscribed_event_ops *ops)
@@ -224,27 +240,23 @@ int v4l2_event_subscribe(struct v4l2_fh *fh,
spin_lock_irqsave(&fh->vdev->fh_lock, flags);
found_ev = v4l2_event_subscribed(fh, sub->type, sub->id);
+ if (!found_ev)
+ list_add(&sev->list, &fh->subscribed);
spin_unlock_irqrestore(&fh->vdev->fh_lock, flags);
if (found_ev) {
/* Already listening */
kvfree(sev);
- goto out_unlock;
- }
-
- if (sev->ops && sev->ops->add) {
+ } else if (sev->ops && sev->ops->add) {
ret = sev->ops->add(sev, elems);
if (ret) {
+ spin_lock_irqsave(&fh->vdev->fh_lock, flags);
+ __v4l2_event_unsubscribe(sev);
+ spin_unlock_irqrestore(&fh->vdev->fh_lock, flags);
kvfree(sev);
- goto out_unlock;
}
}
- spin_lock_irqsave(&fh->vdev->fh_lock, flags);
- list_add(&sev->list, &fh->subscribed);
- spin_unlock_irqrestore(&fh->vdev->fh_lock, flags);
-
-out_unlock:
mutex_unlock(&fh->subscribe_lock);
return ret;
@@ -279,7 +291,6 @@ int v4l2_event_unsubscribe(struct v4l2_fh *fh,
{
struct v4l2_subscribed_event *sev;
unsigned long flags;
- int i;
if (sub->type == V4L2_EVENT_ALL) {
v4l2_event_unsubscribe_all(fh);
@@ -291,14 +302,8 @@ int v4l2_event_unsubscribe(struct v4l2_fh *fh,
spin_lock_irqsave(&fh->vdev->fh_lock, flags);
sev = v4l2_event_subscribed(fh, sub->type, sub->id);
- if (sev != NULL) {
- /* Remove any pending events for this subscription */
- for (i = 0; i < sev->in_use; i++) {
- list_del(&sev->events[sev_pos(sev, i)].list);
- fh->navailable--;
- }
- list_del(&sev->list);
- }
+ if (sev != NULL)
+ __v4l2_event_unsubscribe(sev);
spin_unlock_irqrestore(&fh->vdev->fh_lock, flags);
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: v4l: event: Add subscription to list before calling "add" operation
Author: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Date: Mon Nov 5 09:35:44 2018 -0500
Patch ad608fbcf166 changed how events were subscribed to address an issue
elsewhere. As a side effect of that change, the "add" callback was called
before the event subscription was added to the list of subscribed events,
causing the first event queued by the add callback (and possibly other
events arriving soon afterwards) to be lost.
Fix this by adding the subscription to the list before calling the "add"
callback, and clean up afterwards if that fails.
Fixes: ad608fbcf166 ("media: v4l: event: Prevent freeing event subscriptions while accessed")
Reported-by: Dave Stevenson <dave.stevenson(a)raspberrypi.org>
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Tested-by: Dave Stevenson <dave.stevenson(a)raspberrypi.org>
Reviewed-by: Hans Verkuil <hans.verkuil(a)cisco.com>
Tested-by: Hans Verkuil <hans.verkuil(a)cisco.com>
Cc: stable(a)vger.kernel.org (for 4.14 and up)
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung(a)kernel.org>
drivers/media/v4l2-core/v4l2-event.c | 43 ++++++++++++++++++++----------------
1 file changed, 24 insertions(+), 19 deletions(-)
---
diff --git a/drivers/media/v4l2-core/v4l2-event.c b/drivers/media/v4l2-core/v4l2-event.c
index a3ef1f50a4b3..481e3c65cf97 100644
--- a/drivers/media/v4l2-core/v4l2-event.c
+++ b/drivers/media/v4l2-core/v4l2-event.c
@@ -193,6 +193,22 @@ int v4l2_event_pending(struct v4l2_fh *fh)
}
EXPORT_SYMBOL_GPL(v4l2_event_pending);
+static void __v4l2_event_unsubscribe(struct v4l2_subscribed_event *sev)
+{
+ struct v4l2_fh *fh = sev->fh;
+ unsigned int i;
+
+ lockdep_assert_held(&fh->subscribe_lock);
+ assert_spin_locked(&fh->vdev->fh_lock);
+
+ /* Remove any pending events for this subscription */
+ for (i = 0; i < sev->in_use; i++) {
+ list_del(&sev->events[sev_pos(sev, i)].list);
+ fh->navailable--;
+ }
+ list_del(&sev->list);
+}
+
int v4l2_event_subscribe(struct v4l2_fh *fh,
const struct v4l2_event_subscription *sub, unsigned elems,
const struct v4l2_subscribed_event_ops *ops)
@@ -224,27 +240,23 @@ int v4l2_event_subscribe(struct v4l2_fh *fh,
spin_lock_irqsave(&fh->vdev->fh_lock, flags);
found_ev = v4l2_event_subscribed(fh, sub->type, sub->id);
+ if (!found_ev)
+ list_add(&sev->list, &fh->subscribed);
spin_unlock_irqrestore(&fh->vdev->fh_lock, flags);
if (found_ev) {
/* Already listening */
kvfree(sev);
- goto out_unlock;
- }
-
- if (sev->ops && sev->ops->add) {
+ } else if (sev->ops && sev->ops->add) {
ret = sev->ops->add(sev, elems);
if (ret) {
+ spin_lock_irqsave(&fh->vdev->fh_lock, flags);
+ __v4l2_event_unsubscribe(sev);
+ spin_unlock_irqrestore(&fh->vdev->fh_lock, flags);
kvfree(sev);
- goto out_unlock;
}
}
- spin_lock_irqsave(&fh->vdev->fh_lock, flags);
- list_add(&sev->list, &fh->subscribed);
- spin_unlock_irqrestore(&fh->vdev->fh_lock, flags);
-
-out_unlock:
mutex_unlock(&fh->subscribe_lock);
return ret;
@@ -279,7 +291,6 @@ int v4l2_event_unsubscribe(struct v4l2_fh *fh,
{
struct v4l2_subscribed_event *sev;
unsigned long flags;
- int i;
if (sub->type == V4L2_EVENT_ALL) {
v4l2_event_unsubscribe_all(fh);
@@ -291,14 +302,8 @@ int v4l2_event_unsubscribe(struct v4l2_fh *fh,
spin_lock_irqsave(&fh->vdev->fh_lock, flags);
sev = v4l2_event_subscribed(fh, sub->type, sub->id);
- if (sev != NULL) {
- /* Remove any pending events for this subscription */
- for (i = 0; i < sev->in_use; i++) {
- list_del(&sev->events[sev_pos(sev, i)].list);
- fh->navailable--;
- }
- list_del(&sev->list);
- }
+ if (sev != NULL)
+ __v4l2_event_unsubscribe(sev);
spin_unlock_irqrestore(&fh->vdev->fh_lock, flags);
From: Thomas Richter <tmricht(a)linux.ibm.com>
On s390 the CPU Measurement Facility for counters now supports
2 PMUs named cpum_cf (CPU Measurement Facility for counters) and
cpum_cf_diag (CPU Measurement Facility for diagnostic counters)
for one and the same CPU.
Running command
[root@s35lp76 perf]# ./perf stat -e tx_c_tend \
-- ~/mytests/cf-tx-events 1
Measuring transactions
TX_C_TABORT_NO_SPECIAL: 0 expected:0
TX_C_TABORT_SPECIAL: 0 expected:0
TX_C_TEND: 1 expected:1
TX_NC_TABORT: 11 expected:11
TX_NC_TEND: 1 expected:1
Performance counter stats for '/root/mytests/cf-tx-events 1':
2 tx_c_tend
0.002120091 seconds time elapsed
0.000121000 seconds user
0.002127000 seconds sys
[root@s35lp76 perf]#
displays output which is unexpected (and wrong):
2 tx_c_tend
The test program definitely triggers only one transaction, as shown
in line 'TX_C_TEND: 1 expected:1'.
This is caused by the following call sequence:
pmu_lookup() scans and installs a PMU.
+--> pmu_aliases() parses all aliases in directory
.../<pmu-name>/events/* which are file names.
+--> pmu_aliases_parse() Read each file in directory and create
an new alias entry. This is done with
+--> perf_pmu__new_alias() and
+--> __perf_pmu__new_alias() which also check for
identical alias names.
After pmu_aliases() returns, a complete list of event names
for this pmu has been created. Now function
pmu_add_cpu_aliases() is called to add the events listed in the json
| files to the alias list of the cpu.
+--> perf_pmu__find_map() Returns a pointer to the json events.
Now function pmu_add_cpu_aliases() scans through all events listed
in the JSON files for this CPU.
Each json event pmu name is compared with the current PMU being
built up and if they mismatch, the json event is added to the
current PMUs alias list.
To avoid duplicate entries the following comparison is done:
if (!is_arm_pmu_core(name)) {
pname = pe->pmu ? pe->pmu : "cpu";
if (strncmp(pname, name, strlen(pname)))
continue;
}
The culprit is the strncmp() function.
Using current s390 PMU naming, the first PMU is 'cpum_cf'
and a long list of events is added, among them 'tx_c_tend'
When the second PMU named 'cpum_cf_diag' is added, only one event
named 'CF_DIAG' is added by the pmu_aliases() function.
Now function pmu_add_cpu_aliases() is invoked for PMU 'cpum_cf_diag'.
Since the CPUID string is the same for both PMUs, json file events
for PMU named 'cpum_cf' are added to the PMU 'cpm_cf_diag'
This happens because the strncmp() actually compares:
strncmp("cpum_cf", "cpum_cf_diag", 6);
The first parameter is the pmu name taken from the event in
the json file. The second parameter is the pmu name of the PMU
currently being built.
They are different, but the length of the compare only tests the
common prefix and this returns 0(true) when it should return false.
Now all events for PMU cpum_cf are added to the alias list for pmu
cpum_cf_diag.
Later on in function parse_events_add_pmu() the event 'tx_c_end' is
searched in all available PMUs and found twice, adding it two
times to the evsel_list global variable which is the root
of all events. This results in a counter value of 2 instead
of 1.
Output with this patch:
[root@s35lp76 perf]# ./perf stat -e tx_c_tend \
-- ~/mytests/cf-tx-events 1
Measuring transactions
TX_C_TABORT_NO_SPECIAL: 0 expected:0
TX_C_TABORT_SPECIAL: 0 expected:0
TX_C_TEND: 1 expected:1
TX_NC_TABORT: 11 expected:11
TX_NC_TEND: 1 expected:1
Performance counter stats for '/root/mytests/cf-tx-events 1':
1 tx_c_tend
0.001815365 seconds time elapsed
0.000123000 seconds user
0.001756000 seconds sys
[root@s35lp76 perf]#
Signed-off-by: Thomas Richter <tmricht(a)linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner(a)linux.ibm.com>
Reviewed-by: Sebastien Boisvert <sboisvert(a)gydle.com>
Cc: Heiko Carstens <heiko.carstens(a)de.ibm.com>
Cc: Kan Liang <kan.liang(a)linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
Cc: stable(a)vger.kernel.org
Fixes: 292c34c10249 ("perf pmu: Fix core PMU alias list for X86 platform")
Link: http://lkml.kernel.org/r/20181023151616.78193-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
---
tools/perf/util/pmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 7799788f662f..7e49baad304d 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -773,7 +773,7 @@ static void pmu_add_cpu_aliases(struct list_head *head, struct perf_pmu *pmu)
if (!is_arm_pmu_core(name)) {
pname = pe->pmu ? pe->pmu : "cpu";
- if (strncmp(pname, name, strlen(pname)))
+ if (strcmp(pname, name))
continue;
}
--
2.14.4
Hi, all,
I found the following problem (attached to the end) when testing stable-4.4 with
Syzkaller. This is not an easy-to-trigger problem, so the tool does not generate
code for recurring problems.
>From the call stack, it is because the first parameter in ktime_sub is large, and
the second parameter offset is a negative number, causing the final result to
overflow into the sign bit and become a large negative number.
--------------
...
ktime_t expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
...
--------------
But I don't know how to fix this problem. The mainline code is also different from
stable-4.4, and I have not found a patch to fix this problem in the mainline
repository.
So I am a bit confused about how to fix it. Can anyone give me some advice?
Thanks.
Xiaojun.
================================================================================
UBSAN: Undefined behaviour in kernel/time/hrtimer.c:615:20
signed integer overflow:
9223372036854775807 - -495588161 cannot be represented in type 'long long int'
CPU: 0 PID: 4542 Comm: syz-executor0 Not tainted 4.4.156-514.55.6.9.x86_64+ #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
1ffff100391dbf45 ad071d3307b76e03 ffff8801c8edfab0 ffffffff81c9f586
0000000041b58ab3 ffffffff831fd4e6 ffffffff81c9f478 ffff8801c8edfad8
ffff8801c8edfa78 00000000000014a9 ad071d3307b76e03 ffffffff837fd660
Call Trace:
[<ffffffff81c9f586>] __dump_stack lib/dump_stack.c:15 [inline]
[<ffffffff81c9f586>] dump_stack+0x10e/0x1a8 lib/dump_stack.c:51
[<ffffffff81d814a6>] ubsan_epilogue+0x12/0x8f lib/ubsan.c:164
[<ffffffff81d830a1>] handle_overflow+0x23e/0x299 lib/ubsan.c:195
[<ffffffff81d83157>] __ubsan_handle_sub_overflow+0x2a/0x31 lib/ubsan.c:211
[<ffffffff813d8c33>] hrtimer_reprogram kernel/time/hrtimer.c:615 [inline]
[<ffffffff813d8c33>] hrtimer_start_range_ns+0x1083/0x1580 kernel/time/hrtimer.c:1024
[<ffffffff813fde1f>] hrtimer_start include/linux/hrtimer.h:393 [inline]
[<ffffffff813fde1f>] alarm_start+0xcf/0x130 kernel/time/alarmtimer.c:328
[<ffffffff813fed66>] alarm_timer_set+0x296/0x4a0 kernel/time/alarmtimer.c:632
[<ffffffff813e1a3e>] SYSC_timer_settime kernel/time/posix-timers.c:914 [inline]
[<ffffffff813e1a3e>] SyS_timer_settime+0x2be/0x3d0 kernel/time/posix-timers.c:885
[<ffffffff82c2fb61>] entry_SYSCALL_64_fastpath+0x1e/0x9e
================================================================================
================================================================================
UBSAN: Undefined behaviour in kernel/time/hrtimer.c:490:13
signed integer overflow:
9223372036854775807 - -495588161 cannot be represented in type 'long long int'
CPU: 0 PID: 4542 Comm: syz-executor0 Not tainted 4.4.156-514.55.6.9.x86_64+ #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
1ffff1003ed40f8b ad071d3307b76e03 ffff8801f6a07ce0 ffffffff81c9f586
0000000041b58ab3 ffffffff831fd4e6 ffffffff81c9f478 ffff8801f6a07d08
ffff8801f6a07ca8 000000000000000a ad071d3307b76e03 ffffffff837fd660
Call Trace:
<IRQ> [<ffffffff81c9f586>] __dump_stack lib/dump_stack.c:15 [inline]
<IRQ> [<ffffffff81c9f586>] dump_stack+0x10e/0x1a8 lib/dump_stack.c:51
[<ffffffff81d814a6>] ubsan_epilogue+0x12/0x8f lib/ubsan.c:164
[<ffffffff81d830a1>] handle_overflow+0x23e/0x299 lib/ubsan.c:195
[<ffffffff81d83157>] __ubsan_handle_sub_overflow+0x2a/0x31 lib/ubsan.c:211
[<ffffffff813d43ea>] __hrtimer_get_next_event+0x1da/0x2b0 kernel/time/hrtimer.c:490
[<ffffffff813d9532>] hrtimer_interrupt+0x202/0x580 kernel/time/hrtimer.c:1361
[<ffffffff8113e7ad>] local_apic_timer_interrupt+0x9d/0x150 arch/x86/kernel/apic/apic.c:901
[<ffffffff82c32ea0>] smp_apic_timer_interrupt+0x80/0xb0 arch/x86/kernel/apic/apic.c:925
[<ffffffff82c30ac5>] apic_timer_interrupt+0xa5/0xb0 arch/x86/entry/entry_64.S:563
<EOI> [<ffffffff82c2f0fb>] ? arch_local_irq_restore arch/x86/include/asm/paravirt.h:812 [inline]
<EOI> [<ffffffff82c2f0fb>] ? __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:162 [inline]
<EOI> [<ffffffff82c2f0fb>] ? _raw_spin_unlock_irqrestore+0x3b/0x60 kernel/locking/spinlock.c:191
[<ffffffff813e1a4f>] unlock_timer include/linux/spinlock.h:362 [inline]
[<ffffffff813e1a4f>] SYSC_timer_settime kernel/time/posix-timers.c:916 [inline]
[<ffffffff813e1a4f>] SyS_timer_settime+0x2cf/0x3d0 kernel/time/posix-timers.c:885
[<ffffffff82c2fb61>] entry_SYSCALL_64_fastpath+0x1e/0x9e
================================================================================
From: Michal Hocko <mhocko(a)suse.com>
Baoquan He has noticed that 15c30bc09085 ("mm, memory_hotplug: make
has_unmovable_pages more robust") is causing memory offlining failures
on a movable node. After a further debugging it turned out that
has_unmovable_pages fails prematurely because it stumbles over off-LRU
pages. Nevertheless those pages are not on LRU because they are waiting
on the pcp LRU caches (an example of __dump_page added by a debugging
patch)
[ 560.923297] page:ffffea043f39fa80 count:1 mapcount:0 mapping:ffff880e5dce1b59 index:0x7f6eec459
[ 560.931967] flags: 0x5fffffc0080024(uptodate|active|swapbacked)
[ 560.937867] raw: 005fffffc0080024 dead000000000100 dead000000000200 ffff880e5dce1b59
[ 560.945606] raw: 00000007f6eec459 0000000000000000 00000001ffffffff ffff880e43ae8000
[ 560.953323] page dumped because: hotplug
[ 560.957238] page->mem_cgroup:ffff880e43ae8000
[ 560.961620] has_unmovable_pages: pfn:0x10fd030d, found:0x1, count:0x0
[ 560.968127] page:ffffea043f40c340 count:2 mapcount:0 mapping:ffff880e2f2d8628 index:0x0
[ 560.976104] flags: 0x5fffffc0000006(referenced|uptodate)
[ 560.981401] raw: 005fffffc0000006 dead000000000100 dead000000000200 ffff880e2f2d8628
[ 560.989119] raw: 0000000000000000 0000000000000000 00000002ffffffff ffff88010a8f5000
[ 560.996833] page dumped because: hotplug
The issue could be worked around by calling lru_add_drain_all but we can
do better than that. We know that all swap backed pages are migrateable
and the same applies for pages which do implement the migratepage
callback.
Reported-by: Baoquan He <bhe(a)redhat.com>
Fixes: 15c30bc09085 ("mm, memory_hotplug: make has_unmovable_pages more robust")
Cc: stable
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
---
Hi,
we have been discussing issue reported by Baoquan [1] mostly off-list
and he has confirmed the patch solved failures he is seeing. I believe
that has_unmovable_pages begs for a much better implementation and/or
substantial pages isolation design rethinking but let's close the bug
which can be really annoying first.
[1] http://lkml.kernel.org/r/20181101091055.GA15166@MiWiFi-R3L-srv
mm/page_alloc.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 863d46da6586..48ceda313332 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7824,8 +7824,22 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
if (__PageMovable(page))
continue;
- if (!PageLRU(page))
- found++;
+ if (PageLRU(page))
+ continue;
+
+ /*
+ * Some LRU pages might be temporarily off-LRU for all
+ * sort of different reasons - reclaim, migration,
+ * per-cpu LRU caches etc.
+ * Make sure we do not consider those pages to be unmovable.
+ */
+ if (PageSwapBacked(page))
+ continue;
+
+ if (page->mapping && page->mapping->a_ops &&
+ page->mapping->a_ops->migratepage)
+ continue;
+
/*
* If there are RECLAIMABLE pages, we need to check
* it. But now, memory offline itself doesn't call
@@ -7839,7 +7853,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
* is set to both of a memory hole page and a _used_ kernel
* page at boot.
*/
- if (found > count)
+ if (++found > count)
goto unmovable;
}
return false;
--
2.19.1
In OpenWrt Project the flash write error caused on some products.
Also the issue can be fixed by using chip_good() instead of chip_ready().
The chip_ready() just checks the value from flash memory twice.
And the chip_good() checks the value with the expected value.
Probably the issue can be fixed as checked correctly by the chip_good().
So change to use chip_good() instead of chip_ready().
Signed-off-by: Tokunori Ikegami <ikegami(a)allied-telesis.co.jp>
Signed-off-by: Hauke Mehrtens <hauke(a)hauke-m.de>
Signed-off-by: Koen Vandeputte <koen.vandeputte(a)ncentric.com>
Signed-off-by: Fabio Bettoni <fbettoni(a)gmail.com>
Co-Developed-by: Hauke Mehrtens <hauke(a)hauke-m.de>
Co-Developed-by: Koen Vandeputte <koen.vandeputte(a)ncentric.com>
Co-Developed-by: Fabio Bettoni <fbettoni(a)gmail.com>
Reported-by: Fabio Bettoni <fbettoni(a)gmail.com>
Cc: Chris Packham <chris.packham(a)alliedtelesis.co.nz>
Cc: Joakim Tjernlund <Joakim.Tjernlund(a)infinera.com>
Cc: Boris Brezillon <boris.brezillon(a)free-electrons.com>
Cc: linux-mtd(a)lists.infradead.org
Cc: stable(a)vger.kernel.org
---
Changes since v2:
- Just update the commit message for the comment.
Changes since v1:
- Just update the commit message.
Background:
This is required for OpenWrt Project to result the flash write issue as
below patche.
<https://git.openwrt.org/?p=openwrt/openwrt.git;a=commitdiff;h=ddc11c3932c7b…>
Also the original patch in OpenWRT is below.
<https://github.com/openwrt/openwrt/blob/v18.06.0/target/linux/ar71xx/patche…>
The reason to use chip_good() is that just actually fix the issue.
And also in the past I had fixed the erase function also as same way by the
patch below.
<https://patchwork.ozlabs.org/patch/922656/>
Note: The reason for the patch for erase is same.
In my understanding the chip_ready() is just checked the value twice from
flash.
So I think that sometimes incorrect value is read twice and it is depended
on the flash device behavior but not sure..
So change to use chip_good() instead of chip_ready().
drivers/mtd/chips/cfi_cmdset_0002.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c b/drivers/mtd/chips/cfi_cmdset_0002.c
index 72428b6bfc47..251c9e1675bd 100644
--- a/drivers/mtd/chips/cfi_cmdset_0002.c
+++ b/drivers/mtd/chips/cfi_cmdset_0002.c
@@ -1627,31 +1627,37 @@ static int __xipram do_write_oneword(struct map_info *map, struct flchip *chip,
continue;
}
- if (time_after(jiffies, timeo) && !chip_ready(map, adr)){
+ if (chip_good(map, adr, datum))
+ break;
+
+ if (time_after(jiffies, timeo)){
xip_enable(map, chip, adr);
printk(KERN_WARNING "MTD %s(): software timeout\n", __func__);
xip_disable(map, chip, adr);
+ ret = -EIO;
break;
}
- if (chip_ready(map, adr))
- break;
-
/* Latency issues. Drop the lock, wait a while and retry */
UDELAY(map, chip, adr, 1);
}
+
/* Did we succeed? */
- if (!chip_good(map, adr, datum)) {
+ if (ret) {
/* reset on all failures. */
map_write(map, CMD(0xF0), chip->start);
/* FIXME - should have reset delay before continuing */
- if (++retry_cnt <= MAX_RETRIES)
+ if (++retry_cnt <= MAX_RETRIES) {
+ ret = 0;
goto retry;
+ }
ret = -EIO;
}
+
xip_enable(map, chip, adr);
+
op_done:
if (mode == FL_OTP_WRITE)
otp_exit(map, chip, adr, map_bankwidth(map));
--
2.18.0
commit a725356b6659469d182d662f22d770d83d3bc7b5 upstream.
Commit 031a072a0b8a ("vfs: call vfs_clone_file_range() under freeze
protection") created a wrapper do_clone_file_range() around
vfs_clone_file_range() moving the freeze protection to former, so
overlayfs could call the latter.
The more common vfs practice is to call do_xxx helpers from vfs_xxx
helpers, where freeze protecction is taken in the vfs_xxx helper, so
this anomality could be a source of confusion.
It seems that commit 8ede205541ff ("ovl: add reflink/copyfile/dedup
support") may have fallen a victim to this confusion -
ovl_clone_file_range() calls the vfs_clone_file_range() helper in the
hope of getting freeze protection on upper fs, but in fact results in
overlayfs allowing to bypass upper fs freeze protection.
Swap the names of the two helpers to conform to common vfs practice
and call the correct helpers from overlayfs and nfsd.
Signed-off-by: Amir Goldstein <amir73il(a)gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi(a)redhat.com>
Fixes: 031a072a0b8a ("vfs: call vfs_clone_file_range() under freeze...")
Signed-off-by: Amir Goldstein <amir73il(a)gmail.com>
---
Greg,
The upstream commit (-rc8) is not a bug fix, it's a vfs API fix.
If we do not apply it to stable kernels, we are going to have a
backporting landmine, should a future fix use vfs_clone_file_range()
it will not do the same thing upstream and in stable.
I backported the change to linux-4.18.y and added the Fixes label.
I verified that there are no pathces between the Fixes commit and
current master that use {vfs,do}_clone_file_range() and could be
considered for backporting to stable (all thoses that can be considered
are already in stable).
I verified that that with this backport applies to v4.18.16 there are no
regression of quick clone tests in xfstests with overlayfs and with xfs.
Backport patch also applies cleanly to v4.14.78.
Thanks,
Amir.
fs/ioctl.c | 2 +-
fs/nfsd/vfs.c | 3 ++-
fs/overlayfs/copy_up.c | 2 +-
fs/read_write.c | 17 +++++++++++++++--
include/linux/fs.h | 17 +++--------------
5 files changed, 22 insertions(+), 19 deletions(-)
diff --git a/fs/ioctl.c b/fs/ioctl.c
index b445b13fc59b..5444fec607ce 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -229,7 +229,7 @@ static long ioctl_file_clone(struct file *dst_file, unsigned long srcfd,
ret = -EXDEV;
if (src_file.file->f_path.mnt != dst_file->f_path.mnt)
goto fdput;
- ret = do_clone_file_range(src_file.file, off, dst_file, destoff, olen);
+ ret = vfs_clone_file_range(src_file.file, off, dst_file, destoff, olen);
fdput:
fdput(src_file);
return ret;
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index b0555d7d8200..613d2fe2dddd 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -541,7 +541,8 @@ __be32 nfsd4_set_nfs4_label(struct svc_rqst *rqstp, struct svc_fh *fhp,
__be32 nfsd4_clone_file_range(struct file *src, u64 src_pos, struct file *dst,
u64 dst_pos, u64 count)
{
- return nfserrno(do_clone_file_range(src, src_pos, dst, dst_pos, count));
+ return nfserrno(vfs_clone_file_range(src, src_pos, dst, dst_pos,
+ count));
}
ssize_t nfsd_copy_file_range(struct file *src, u64 src_pos, struct file *dst,
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index ddaddb4ce4c3..26b477f2538d 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -156,7 +156,7 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
}
/* Try to use clone_file_range to clone up within the same fs */
- error = vfs_clone_file_range(old_file, 0, new_file, 0, len);
+ error = do_clone_file_range(old_file, 0, new_file, 0, len);
if (!error)
goto out;
/* Couldn't clone, so now we try to copy the data */
diff --git a/fs/read_write.c b/fs/read_write.c
index 153f8f690490..c9d489684335 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1818,8 +1818,8 @@ int vfs_clone_file_prep_inodes(struct inode *inode_in, loff_t pos_in,
}
EXPORT_SYMBOL(vfs_clone_file_prep_inodes);
-int vfs_clone_file_range(struct file *file_in, loff_t pos_in,
- struct file *file_out, loff_t pos_out, u64 len)
+int do_clone_file_range(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out, u64 len)
{
struct inode *inode_in = file_inode(file_in);
struct inode *inode_out = file_inode(file_out);
@@ -1866,6 +1866,19 @@ int vfs_clone_file_range(struct file *file_in, loff_t pos_in,
return ret;
}
+EXPORT_SYMBOL(do_clone_file_range);
+
+int vfs_clone_file_range(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out, u64 len)
+{
+ int ret;
+
+ file_start_write(file_out);
+ ret = do_clone_file_range(file_in, pos_in, file_out, pos_out, len);
+ file_end_write(file_out);
+
+ return ret;
+}
EXPORT_SYMBOL(vfs_clone_file_range);
/*
diff --git a/include/linux/fs.h b/include/linux/fs.h
index a3afa50bb79f..e73363bd8646 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1813,8 +1813,10 @@ extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *,
extern int vfs_clone_file_prep_inodes(struct inode *inode_in, loff_t pos_in,
struct inode *inode_out, loff_t pos_out,
u64 *len, bool is_dedupe);
+extern int do_clone_file_range(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out, u64 len);
extern int vfs_clone_file_range(struct file *file_in, loff_t pos_in,
- struct file *file_out, loff_t pos_out, u64 len);
+ struct file *file_out, loff_t pos_out, u64 len);
extern int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
struct inode *dest, loff_t destoff,
loff_t len, bool *is_same);
@@ -2755,19 +2757,6 @@ static inline void file_end_write(struct file *file)
__sb_end_write(file_inode(file)->i_sb, SB_FREEZE_WRITE);
}
-static inline int do_clone_file_range(struct file *file_in, loff_t pos_in,
- struct file *file_out, loff_t pos_out,
- u64 len)
-{
- int ret;
-
- file_start_write(file_out);
- ret = vfs_clone_file_range(file_in, pos_in, file_out, pos_out, len);
- file_end_write(file_out);
-
- return ret;
-}
-
/*
* get_write_access() gets write permission for a file.
* put_write_access() releases this write permission.
--
2.7.4
commit 0962590e553331db2cc0aef2dc35c57f6300dbbe upstream.
ALU operations on pointers such as scalar_reg += map_value_ptr are
handled in adjust_ptr_min_max_vals(). Problem is however that map_ptr
and range in the register state share a union, so transferring state
through dst_reg->range = ptr_reg->range is just buggy as any new
map_ptr in the dst_reg is then truncated (or null) for subsequent
checks. Fix this by adding a raw member and use it for copying state
over to dst_reg.
Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Cc: Edward Cree <ecree(a)solarflare.com>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Acked-by: Edward Cree <ecree(a)solarflare.com>
---
include/linux/bpf_verifier.h | 3 +++
kernel/bpf/verifier.c | 10 ++++++----
2 files changed, 9 insertions(+), 4 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 38b04f5..1fd6fa8 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -50,6 +50,9 @@ struct bpf_reg_state {
* PTR_TO_MAP_VALUE_OR_NULL
*/
struct bpf_map *map_ptr;
+
+ /* Max size from any of the above. */
+ unsigned long raw;
};
/* Fixed part of pointer offset, pointer types only */
s32 off;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 82e8ede..b000686 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2731,7 +2731,7 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
dst_reg->umax_value = umax_ptr;
dst_reg->var_off = ptr_reg->var_off;
dst_reg->off = ptr_reg->off + smin_val;
- dst_reg->range = ptr_reg->range;
+ dst_reg->raw = ptr_reg->raw;
break;
}
/* A new variable offset is created. Note that off_reg->off
@@ -2761,10 +2761,11 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
}
dst_reg->var_off = tnum_add(ptr_reg->var_off, off_reg->var_off);
dst_reg->off = ptr_reg->off;
+ dst_reg->raw = ptr_reg->raw;
if (reg_is_pkt_pointer(ptr_reg)) {
dst_reg->id = ++env->id_gen;
/* something was added to pkt_ptr, set range to zero */
- dst_reg->range = 0;
+ dst_reg->raw = 0;
}
break;
case BPF_SUB:
@@ -2793,7 +2794,7 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
dst_reg->var_off = ptr_reg->var_off;
dst_reg->id = ptr_reg->id;
dst_reg->off = ptr_reg->off - smin_val;
- dst_reg->range = ptr_reg->range;
+ dst_reg->raw = ptr_reg->raw;
break;
}
/* A new variable offset is created. If the subtrahend is known
@@ -2819,11 +2820,12 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
}
dst_reg->var_off = tnum_sub(ptr_reg->var_off, off_reg->var_off);
dst_reg->off = ptr_reg->off;
+ dst_reg->raw = ptr_reg->raw;
if (reg_is_pkt_pointer(ptr_reg)) {
dst_reg->id = ++env->id_gen;
/* something was added to pkt_ptr, set range to zero */
if (smin_val < 0)
- dst_reg->range = 0;
+ dst_reg->raw = 0;
}
break;
case BPF_AND:
--
2.9.5
This patch series contains backports for 4.14 of two patches that were
submitted to stable, but failed to apply. One patch adds support for
dynamic interface configuration on the Quectel EP06, while the other
contains an upstream change triggered by the EP06-change.
The reason the patches failed to apply, is that option_probe() in option.c
has been changed upstream. A slight reshuffling of the changes in "USB:
serial: option: improve Quectel EP06 detection" was required. "USB: serial:
option: add two-endpoints device-id flag" applied cleanly after the change,
but a small change was still needed. The upstream commit removes a variable
that is still in use in 4.14, so this change is removed.
Signed-off-by: Kristian Evensen <kristian.evensen(a)gmail.com>
Johan Hovold (1):
USB: serial: option: add two-endpoints device-id flag
Kristian Evensen (1):
USB: serial: option: improve Quectel EP06 detection
drivers/usb/serial/option.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
--
2.14.1
The patch titled
Subject: mm/swapfile.c: use kvzalloc for swap_info_struct allocation
has been added to the -mm tree. Its filename is
mm-use-kvzalloc-for-swap_info_struct-allocation.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-use-kvzalloc-for-swap_info_stru…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-use-kvzalloc-for-swap_info_stru…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Vasily Averin <vvs(a)virtuozzo.com>
Subject: mm/swapfile.c: use kvzalloc for swap_info_struct allocation
a2468cc9bfdf ("swap: choose swap device according to numa node") changed
'avail_lists' field of 'struct swap_info_struct' to an array. In popular
linux distros it increased size of swap_info_struct up to 40 Kbytes and
now swap_info_struct allocation requires order-4 page. Switch to
kvzmalloc allows to avoid unexpected allocation failures.
Link: http://lkml.kernel.org/r/fc23172d-3c75-21e2-d551-8b1808cbe593@virtuozzo.com
Fixes: a2468cc9bfdf ("swap: choose swap device according to numa node")
Signed-off-by: Vasily Averin <vvs(a)virtuozzo.com>
Acked-by: Aaron Lu <aaron.lu(a)intel.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Huang Ying <ying.huang(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/swapfile.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/mm/swapfile.c~mm-use-kvzalloc-for-swap_info_struct-allocation
+++ a/mm/swapfile.c
@@ -2813,7 +2813,7 @@ static struct swap_info_struct *alloc_sw
unsigned int type;
int i;
- p = kzalloc(sizeof(*p), GFP_KERNEL);
+ p = kvzalloc(sizeof(*p), GFP_KERNEL);
if (!p)
return ERR_PTR(-ENOMEM);
@@ -2824,7 +2824,7 @@ static struct swap_info_struct *alloc_sw
}
if (type >= MAX_SWAPFILES) {
spin_unlock(&swap_lock);
- kfree(p);
+ kvfree(p);
return ERR_PTR(-EPERM);
}
if (type >= nr_swapfiles) {
@@ -2838,7 +2838,7 @@ static struct swap_info_struct *alloc_sw
smp_wmb();
nr_swapfiles++;
} else {
- kfree(p);
+ kvfree(p);
p = swap_info[type];
/*
* Do not memset this entry: a racing procfs swap_next()
_
Patches currently in -mm which might be from vvs(a)virtuozzo.com are
mm-use-kvzalloc-for-swap_info_struct-allocation.patch
Please cherry-pick the following commit to 4.14 and 4.18:
commit a2b3bf4846e5eed62ea6abb096af2c950961033c
Author: Alan Chiang <alanx.chiang(a)intel.com>
Date: Wed Jul 25 11:20:22 2018 +0800
eeprom: at24: Add support for address-width property
Provide a flexible way to determine the addressing bits of eeprom.
Pass the addressing bits to driver through address-width property.
Signed-off-by: Alan Chiang <alanx.chiang(a)intel.com>
Signed-off-by: Andy Yeh <andy.yeh(a)intel.com>
Signed-off-by: Bartosz Golaszewski <brgl(a)bgdev.pl>
Confirmed to work on 4.14 with the Identification Page
of an ST M24M02-DR (256 bytes but 16 bit addressing).
Cannot be cherry-picked trivially on 4.9.
The corresponding documentation commit 21d04054501fb27b56e995b54ac74e39aee79a46
can be cherry-picked to 4.18, the backport for 4.14 is below.
Thanks
Adrian
>From 2562e333f39b8077ffb06bdf79430f10b74c11f5 Mon Sep 17 00:00:00 2001
From: Alan Chiang <alanx.chiang(a)intel.com>
Date: Wed, 25 Jul 2018 11:20:21 +0800
Subject: [PATCH] dt-bindings: at24: Add address-width property
Currently the only way to use a variant of a supported model with
a different address width is to define a new compatible string and
the corresponding chip data structure.
Provide a flexible way to specify the size of the address pointer
by defining a new property: address-width.
Signed-off-by: Alan Chiang <alanx.chiang(a)intel.com>
Signed-off-by: Andy Yeh <andy.yeh(a)intel.com>
Acked-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Reviewed-by: Rob Herring <robh(a)kernel.org>
[Bartosz: fixed the commit message]
Signed-off-by: Bartosz Golaszewski <brgl(a)bgdev.pl>
[Adrian Bunk: backported to 4.14]
Signed-off-by: Adrian Bunk <bunk(a)kernel.org>
---
Documentation/devicetree/bindings/eeprom/eeprom.txt | 2 ++
1 file changed, 2 insertions(+)
diff --git a/Documentation/devicetree/bindings/eeprom/eeprom.txt b/Documentation/devicetree/bindings/eeprom/eeprom.txt
index afc04589eadf..44bfffc43bed 100644
--- a/Documentation/devicetree/bindings/eeprom/eeprom.txt
+++ b/Documentation/devicetree/bindings/eeprom/eeprom.txt
@@ -36,6 +36,8 @@ Optional properties:
- read-only: this parameterless property disables writes to the eeprom
+ - address-width: number of address bits (one of 8, 16).
+
Example:
eeprom@52 {
--
2.11.0
Hello,
Please picked up this patch for linux 4.4 and 4.9.
Compiled/tested without problem.
[ Upstream commit d312fefea8387503375f728855c9a62de20c9665 ]
From: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
Date: Mon, 2 Oct 2017 19:31:24 +0100
Subject: [PATCH] ahci: don't ignore result code of ahci_reset_controller()
ahci_pci_reset_controller() calls ahci_reset_controller(), which may
fail, but ignores the result code and always returns success. This
may result in failures like below
ahci 0000:02:00.0: version 3.0
ahci 0000:02:00.0: enabling device (0000 -> 0003)
ahci 0000:02:00.0: SSS flag set, parallel bus scan disabled
ahci 0000:02:00.0: controller reset failed (0xffffffff)
ahci 0000:02:00.0: failed to stop engine (-5)
... repeated many times ...
ahci 0000:02:00.0: failed to stop engine (-5)
Unable to handle kernel paging request at virtual address ffff0000093f9018
...
PC is at ahci_stop_engine+0x5c/0xd8 [libahci]
LR is at ahci_deinit_port.constprop.12+0x1c/0xc0 [libahci]
...
[<ffff000000a17014>] ahci_stop_engine+0x5c/0xd8 [libahci]
[<ffff000000a196b4>] ahci_deinit_port.constprop.12+0x1c/0xc0 [libahci]
[<ffff000000a197d8>] ahci_init_controller+0x80/0x168 [libahci]
[<ffff000000a260f8>] ahci_pci_init_controller+0x60/0x68 [ahci]
[<ffff000000a26f94>] ahci_init_one+0x75c/0xd88 [ahci]
[<ffff000008430324>] local_pci_probe+0x3c/0xb8
[<ffff000008431728>] pci_device_probe+0x138/0x170
[<ffff000008585e54>] driver_probe_device+0x2dc/0x458
[<ffff0000085860e4>] __driver_attach+0x114/0x118
[<ffff000008583ca8>] bus_for_each_dev+0x60/0xa0
[<ffff000008585638>] driver_attach+0x20/0x28
[<ffff0000085850b0>] bus_add_driver+0x1f0/0x2a8
[<ffff000008586ae0>] driver_register+0x60/0xf8
[<ffff00000842f9b4>] __pci_register_driver+0x3c/0x48
[<ffff000000a3001c>] ahci_pci_driver_init+0x1c/0x1000 [ahci]
[<ffff000008083918>] do_one_initcall+0x38/0x120
where an obvious hardware level failure results in an unnecessary 15 second
delay and a subsequent crash.
So record the result code of ahci_reset_controller() and relay it, rather
than ignoring it.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
Signed-off-by: Tejun Heo <tj(a)kernel.org>
---
drivers/ata/ahci.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index cb9b0e9090e3b..9f78bb03bb763 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -621,8 +621,11 @@ static void ahci_pci_save_initial_config(struct pci_dev *pdev,
static int ahci_pci_reset_controller(struct ata_host *host)
{
struct pci_dev *pdev = to_pci_dev(host->dev);
+ int rc;
- ahci_reset_controller(host);
+ rc = ahci_reset_controller(host);
+ if (rc)
+ return rc;
if (pdev->vendor == PCI_VENDOR_ID_INTEL) {
struct ahci_host_priv *hpriv = host->private_data;