Linux expects that if a CPU modifies a memory location, then that
modification will eventually become visible to other CPUs in the system.
On the Loongson-3 processor with SFB (Store Fill Buffer), loads may be
prioritised over stores, so it is possible for a store operation to be
postponed if a polling loop immediately follows it. If the variable
being polled indirectly depends on the outstanding store [for example,
another CPU may be polling the variable that is pending modification]
then there is the potential for deadlock if interrupts are disabled.
This deadlock occurs in qspinlock code.
This patch changes the definition of cpu_relax() to smp_mb() for
Loongson-3, forcing a flush of the SFB on SMP systems before the
next load takes place. If the kernel is not compiled for SMP support,
this will expand to a barrier() as before.
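For illustration only, a sketch of the deadlock pattern described above
(invented names; this is not the kernel's qspinlock code). Both CPUs run
with interrupts disabled:

static int x, y;

/* CPU A: publishes x, then polls y */
static void cpu_a(void)
{
	WRITE_ONCE(x, 1);		/* may linger in the Store Fill Buffer */
	while (!READ_ONCE(y))		/* polling loads keep winning over the store */
		cpu_relax();		/* with cpu_relax() == smp_mb() the SFB drains here */
}

/* CPU B: waits for x, then publishes y */
static void cpu_b(void)
{
	while (!READ_ONCE(x))		/* never observes x == 1 while it sits in A's SFB */
		cpu_relax();
	WRITE_ONCE(y, 1);		/* the store CPU A is spinning on */
}

If CPU A's store never becomes visible, CPU B never writes y and both CPUs
spin forever; making cpu_relax() a full barrier on SMP breaks the cycle.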
References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() for ARM11MPCore)
Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
---
arch/mips/include/asm/processor.h | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
index af34afb..a8c4a3a 100644
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
#define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
#define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
+#ifdef CONFIG_CPU_LOONGSON3
+/*
+ * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read
+ * loop. Since spin loops of any kind should have a cpu_relax() in them, force
+ * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes will
+ * become available as expected.
+ */
+#define cpu_relax() smp_mb()
+#else
#define cpu_relax() barrier()
+#endif
/*
* Return_address is a replacement for __builtin_return_address(count)
--
2.7.0
The patch below does not apply to the 4.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e5d54f1935722f83df7619f3978f774c2b802cd8 Mon Sep 17 00:00:00 2001
From: Lyude Paul <lyude(a)redhat.com>
Date: Thu, 12 Jul 2018 13:02:53 -0400
Subject: [PATCH] drm/nouveau/drm/nouveau: Fix runtime PM leak in
nv50_disp_atomic_commit()
A CRTC being enabled doesn't mean it's on! It doesn't even necessarily
mean it's being used. This fixes runtime PM leaks on the P50 I've got
next to me.
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs(a)redhat.com>
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index 9382e99a0bc7..31b12b4f321a 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -1878,7 +1878,7 @@ nv50_disp_atomic_commit(struct drm_device *dev,
nv50_disp_atomic_commit_tail(state);
drm_for_each_crtc(crtc, dev) {
- if (crtc->state->enable) {
+ if (crtc->state->active) {
if (!drm->have_disp_power_ref) {
drm->have_disp_power_ref = true;
return 0;
Hi Greg,
Please consider this patchset, which includes block/scsi multiqueue performance
enhancements and bugfixes.
We've run multiple benchmarks and different tests for over one week, and the
results look good.
These patches are also included in Oracle UEK5.
They're almost all simple cherry-picks; only 2 patches needed minor adjustment.
They apply cleanly on 4.14.57.
Jens Axboe (3):
Revert "blk-mq: don't handle TAG_SHARED in restart"
blk-mq: fix issue with shared tag queue re-running
blk-mq: only run the hardware queue if IO is pending
Jianchao Wang (1):
blk-mq: put the driver tag of nxt rq before first one is requeued
Ming Lei (19):
blk-mq-sched: move actual dispatching into one helper
blk-mq: introduce .get_budget and .put_budget in blk_mq_ops
sbitmap: introduce __sbitmap_for_each_set()
blk-mq-sched: improve dispatching from sw queue
scsi: allow passing in null rq to scsi_prep_state_check()
scsi: implement .get_budget and .put_budget for blk-mq
SCSI: don't get target/host busy_count in scsi_mq_get_budget()
blk-mq: don't handle TAG_SHARED in restart
blk-mq: don't restart queue when .get_budget returns BLK_STS_RESOURCE
blk-mq: don't handle failure in .get_budget
blk-flush: don't run queue for requests bypassing flush
block: pass 'run_queue' to blk_mq_request_bypass_insert
blk-flush: use blk_mq_request_bypass_insert()
blk-mq-sched: decide how to handle flush rq via RQF_FLUSH_SEQ
blk-mq: move blk_mq_put_driver_tag*() into blk-mq.h
blk-mq: don't allocate driver tag upfront for flush rq
blk-mq: put driver tag if dispatch budget can't be got
blk-mq: quiesce queue during switching io sched and updating
nr_requests
scsi: core: run queue if SCSI device queue isn't ready and queue is
idle
block/blk-core.c | 2 +-
block/blk-flush.c | 37 +++++--
block/blk-mq-debugfs.c | 1 -
block/blk-mq-sched.c | 203 ++++++++++++++++++++++-------------
block/blk-mq.c | 278 +++++++++++++++++++++++++++---------------------
block/blk-mq.h | 58 +++++++++-
block/elevator.c | 2 +
drivers/scsi/scsi_lib.c | 53 ++++++---
include/linux/blk-mq.h | 20 +++-
include/linux/sbitmap.h | 64 ++++++++---
10 files changed, 475 insertions(+), 243 deletions(-)
--
2.7.4
Function atomic_inc_unless_negative() returns a bool to indicate
success/failure. However, cxl_adapter_context_get() wrongly compares
the return value against '>=0', which will always be true. The patch
fixes the comparison to test the boolean result directly, thereby also
fixing this compile-time warning:
drivers/misc/cxl/main.c:290 cxl_adapter_context_get()
warn: 'atomic_inc_unless_negative(&adapter->contexts_num)' is unsigned
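A minimal user-space illustration of the bug (the stub below only stands in
for atomic_inc_unless_negative(); it is not the kernel implementation, and
the names are invented):

#include <stdbool.h>
#include <stdio.h>

/* stand-in for atomic_inc_unless_negative(): returns false when the counter
 * is negative, true after a successful increment */
static bool inc_unless_negative(int *v)
{
	if (*v < 0)
		return false;
	(*v)++;
	return true;
}

int main(void)
{
	int contexts_num = -1;	/* e.g. an adapter reset has blocked new contexts */
	bool rc = inc_unless_negative(&contexts_num);

	printf("rc >= 0 -> %d\n", rc >= 0);	/* always 1, even though rc is false */
	printf("rc      -> %d\n", rc);		/* 0 here, so 'rc ? 0 : -EBUSY' is correct */
	return 0;
}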
Cc: stable(a)vger.kernel.org
Fixes: 70b565bbdb91 ("cxl: Prevent adapter reset if an active context exists")
Reported-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Signed-off-by: Vaibhav Jain <vaibhav(a)linux.ibm.com>
---
drivers/misc/cxl/main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/misc/cxl/main.c b/drivers/misc/cxl/main.c
index c1ba0d42cbc8..e0f29b8a872d 100644
--- a/drivers/misc/cxl/main.c
+++ b/drivers/misc/cxl/main.c
@@ -287,7 +287,7 @@ int cxl_adapter_context_get(struct cxl *adapter)
int rc;
rc = atomic_inc_unless_negative(&adapter->contexts_num);
- return rc >= 0 ? 0 : -EBUSY;
+ return rc ? 0 : -EBUSY;
}
void cxl_adapter_context_put(struct cxl *adapter)
--
2.17.1
Commit b1092c9af9ed ("bcache: allow quick writeback when backing idle")
allows the writeback rate to be faster if there is no I/O request on a
bcache device. It works well if there is only one bcache device attached
to the cache set. If there are many bcache devices attached to a cache
set, it may introduce a performance regression because multiple faster
writeback threads of the idle bcache devices will compete for the btree
level locks with the bcache device that has I/O requests coming in.
This patch fixes the above issue by only permitting fast writeback when
all bcache devices attached to the cache set are idle. If one of the
bcache devices has new I/O requests coming in, the writeback throughput
of all devices is minimized immediately and the PI controller
__update_writeback_rate() decides the upcoming writeback rate for each
bcache device.
Also, when all bcache devices are idle, limiting the writeback rate to a
small number is a waste of throughput, especially when the backing
devices are slower non-rotational devices (e.g. SATA SSDs). This patch
sets a max writeback rate for each backing device if the whole cache set
is idle. A faster writeback rate in idle time means new I/Os may have
more available space for dirty data, and people may observe better write
performance then.
Please note that bcache may change its cache mode at runtime, and this
patch still works if the cache mode is switched away from writeback mode
while there is still dirty data on the cache.
Fixes: b1092c9af9ed ("bcache: allow quick writeback when backing idle")
Cc: stable(a)vger.kernel.org #4.16+
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Michael Lyle <mlyle(a)lyle.org>
---
drivers/md/bcache/bcache.h | 9 +---
drivers/md/bcache/request.c | 42 ++++++++++++++-
drivers/md/bcache/sysfs.c | 14 +++--
drivers/md/bcache/util.c | 2 +-
drivers/md/bcache/util.h | 2 +-
drivers/md/bcache/writeback.c | 98 +++++++++++++++++++++++++----------
6 files changed, 126 insertions(+), 41 deletions(-)
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index d6bf294f3907..f7451e8be03c 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -328,13 +328,6 @@ struct cached_dev {
*/
atomic_t has_dirty;
- /*
- * Set to zero by things that touch the backing volume-- except
- * writeback. Incremented by writeback. Used to determine when to
- * accelerate idle writeback.
- */
- atomic_t backing_idle;
-
struct bch_ratelimit writeback_rate;
struct delayed_work writeback_rate_update;
@@ -514,6 +507,8 @@ struct cache_set {
struct cache_accounting accounting;
unsigned long flags;
+ atomic_t idle_counter;
+ atomic_t at_max_writeback_rate;
struct cache_sb sb;
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index ae67f5fa8047..fe45f561a054 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -1104,6 +1104,34 @@ static void detached_dev_do_request(struct bcache_device *d, struct bio *bio)
/* Cached devices - read & write stuff */
+static void quit_max_writeback_rate(struct cache_set *c)
+{
+ int i;
+ struct bcache_device *d;
+ struct cached_dev *dc;
+
+ mutex_lock(&bch_register_lock);
+
+ for (i = 0; i < c->devices_max_used; i++) {
+ if (!c->devices[i])
+ continue;
+
+ if (UUID_FLASH_ONLY(&c->uuids[i]))
+ continue;
+
+ d = c->devices[i];
+ dc = container_of(d, struct cached_dev, disk);
+ /*
+ * set writeback rate to default minimum value,
+ * then let update_writeback_rate() decide the
+ * upcoming rate.
+ */
+ atomic64_set(&dc->writeback_rate.rate, 1);
+ }
+
+ mutex_unlock(&bch_register_lock);
+}
+
static blk_qc_t cached_dev_make_request(struct request_queue *q,
struct bio *bio)
{
@@ -1119,7 +1147,19 @@ static blk_qc_t cached_dev_make_request(struct request_queue *q,
return BLK_QC_T_NONE;
}
- atomic_set(&dc->backing_idle, 0);
+ if (d->c) {
+ atomic_set(&d->c->idle_counter, 0);
+ /*
+ * If at_max_writeback_rate of cache set is true and new I/O
+ * comes, quit max writeback rate of all cached devices
+ * attached to this cache set, and set at_max_writeback_rate
+ * to false.
+ */
+ if (unlikely(atomic_read(&d->c->at_max_writeback_rate) == 1)) {
+ atomic_set(&d->c->at_max_writeback_rate, 0);
+ quit_max_writeback_rate(d->c);
+ }
+ }
generic_start_io_acct(q, rw, bio_sectors(bio), &d->disk->part0);
bio_set_dev(bio, dc->bdev);
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 225b15aa0340..d719021bff81 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -170,7 +170,8 @@ SHOW(__bch_cached_dev)
var_printf(writeback_running, "%i");
var_print(writeback_delay);
var_print(writeback_percent);
- sysfs_hprint(writeback_rate, dc->writeback_rate.rate << 9);
+ sysfs_hprint(writeback_rate,
+ atomic64_read(&dc->writeback_rate.rate) << 9);
sysfs_hprint(io_errors, atomic_read(&dc->io_errors));
sysfs_printf(io_error_limit, "%i", dc->error_limit);
sysfs_printf(io_disable, "%i", dc->io_disable);
@@ -188,7 +189,8 @@ SHOW(__bch_cached_dev)
char change[20];
s64 next_io;
- bch_hprint(rate, dc->writeback_rate.rate << 9);
+ bch_hprint(rate,
+ atomic64_read(&dc->writeback_rate.rate) << 9);
bch_hprint(dirty, bcache_dev_sectors_dirty(&dc->disk) << 9);
bch_hprint(target, dc->writeback_rate_target << 9);
bch_hprint(proportional,dc->writeback_rate_proportional << 9);
@@ -255,8 +257,12 @@ STORE(__cached_dev)
sysfs_strtoul_clamp(writeback_percent, dc->writeback_percent, 0, 40);
- sysfs_strtoul_clamp(writeback_rate,
- dc->writeback_rate.rate, 1, INT_MAX);
+ if (attr == &sysfs_writeback_rate) {
+ int v;
+
+ sysfs_strtoul_clamp(writeback_rate, v, 1, INT_MAX);
+ atomic64_set(&dc->writeback_rate.rate, v);
+ }
sysfs_strtoul_clamp(writeback_rate_update_seconds,
dc->writeback_rate_update_seconds,
diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c
index fc479b026d6d..84f90c3d996d 100644
--- a/drivers/md/bcache/util.c
+++ b/drivers/md/bcache/util.c
@@ -200,7 +200,7 @@ uint64_t bch_next_delay(struct bch_ratelimit *d, uint64_t done)
{
uint64_t now = local_clock();
- d->next += div_u64(done * NSEC_PER_SEC, d->rate);
+ d->next += div_u64(done * NSEC_PER_SEC, atomic64_read(&d->rate));
/* Bound the time. Don't let us fall further than 2 seconds behind
* (this prevents unnecessary backlog that would make it impossible
diff --git a/drivers/md/bcache/util.h b/drivers/md/bcache/util.h
index cced87f8eb27..7e17f32ab563 100644
--- a/drivers/md/bcache/util.h
+++ b/drivers/md/bcache/util.h
@@ -442,7 +442,7 @@ struct bch_ratelimit {
* Rate at which we want to do work, in units per second
* The units here correspond to the units passed to bch_next_delay()
*/
- uint32_t rate;
+ atomic64_t rate;
};
static inline void bch_ratelimit_reset(struct bch_ratelimit *d)
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index ad45ebe1a74b..72059f910230 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -49,6 +49,63 @@ static uint64_t __calc_target_rate(struct cached_dev *dc)
return (cache_dirty_target * bdev_share) >> WRITEBACK_SHARE_SHIFT;
}
+static bool set_at_max_writeback_rate(struct cache_set *c,
+ struct cached_dev *dc)
+{
+ int i, dirty_dc_nr = 0;
+ struct bcache_device *d;
+
+ mutex_lock(&bch_register_lock);
+ for (i = 0; i < c->devices_max_used; i++) {
+ if (!c->devices[i])
+ continue;
+ if (UUID_FLASH_ONLY(&c->uuids[i]))
+ continue;
+ d = c->devices[i];
+ dc = container_of(d, struct cached_dev, disk);
+ if (atomic_read(&dc->has_dirty))
+ dirty_dc_nr++;
+ }
+ mutex_unlock(&bch_register_lock);
+
+ /*
+ * Idle_counter is increased every time update_writeback_rate() is
+ * rescheduled in. If all backing devices attached to the same cache
+ * set have the same dc->writeback_rate_update_seconds value, then
+ * after about 10 rounds of update_writeback_rate() on each backing
+ * device, the code below sets c->at_max_writeback_rate to 1 and sets
+ * a max writeback rate for each dc->writeback_rate.rate. This is not
+ * very accurate but works well to make sure the whole cache set has
+ * no new I/O coming before the writeback rate is set to a max
+ * number.
+ */
+ if (atomic_inc_return(&c->idle_counter) < dirty_dc_nr * 10)
+ return false;
+
+ if (atomic_read(&c->at_max_writeback_rate) != 1)
+ atomic_set(&c->at_max_writeback_rate, 1);
+
+
+ atomic64_set(&dc->writeback_rate.rate, INT_MAX);
+
+ /* keep writeback_rate_target as existing value */
+ dc->writeback_rate_proportional = 0;
+ dc->writeback_rate_integral_scaled = 0;
+ dc->writeback_rate_change = 0;
+
+ /*
+ * Check c->idle_counter and c->at_max_writeback_rate again in case
+ * new I/O arrives before set_at_max_writeback_rate() returns.
+ * Then the writeback rate is set to 1, and its new value should be
+ * decided via __update_writeback_rate().
+ */
+ if (atomic_read(&c->idle_counter) < dirty_dc_nr * 10 ||
+ !atomic_read(&c->at_max_writeback_rate))
+ return false;
+
+ return true;
+}
+
static void __update_writeback_rate(struct cached_dev *dc)
{
/*
@@ -104,8 +161,9 @@ static void __update_writeback_rate(struct cached_dev *dc)
dc->writeback_rate_proportional = proportional_scaled;
dc->writeback_rate_integral_scaled = integral_scaled;
- dc->writeback_rate_change = new_rate - dc->writeback_rate.rate;
- dc->writeback_rate.rate = new_rate;
+ dc->writeback_rate_change = new_rate -
+ atomic64_read(&dc->writeback_rate.rate);
+ atomic64_set(&dc->writeback_rate.rate, new_rate);
dc->writeback_rate_target = target;
}
@@ -138,9 +196,16 @@ static void update_writeback_rate(struct work_struct *work)
down_read(&dc->writeback_lock);
- if (atomic_read(&dc->has_dirty) &&
- dc->writeback_percent)
- __update_writeback_rate(dc);
+ if (atomic_read(&dc->has_dirty) && dc->writeback_percent) {
+ /*
+ * If the whole cache set is idle, set_at_max_writeback_rate()
+ * will set writeback rate to a max number. Then it is
+ * unnecessary to update writeback rate for an idle cache set
+ * in maximum writeback rate number(s).
+ */
+ if (!set_at_max_writeback_rate(c, dc))
+ __update_writeback_rate(dc);
+ }
up_read(&dc->writeback_lock);
@@ -422,27 +487,6 @@ static void read_dirty(struct cached_dev *dc)
delay = writeback_delay(dc, size);
- /* If the control system would wait for at least half a
- * second, and there's been no reqs hitting the backing disk
- * for awhile: use an alternate mode where we have at most
- * one contiguous set of writebacks in flight at a time. If
- * someone wants to do IO it will be quick, as it will only
- * have to contend with one operation in flight, and we'll
- * be round-tripping data to the backing disk as quickly as
- * it can accept it.
- */
- if (delay >= HZ / 2) {
- /* 3 means at least 1.5 seconds, up to 7.5 if we
- * have slowed way down.
- */
- if (atomic_inc_return(&dc->backing_idle) >= 3) {
- /* Wait for current I/Os to finish */
- closure_sync(&cl);
- /* And immediately launch a new set. */
- delay = 0;
- }
- }
-
while (!kthread_should_stop() &&
!test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) &&
delay) {
@@ -715,7 +759,7 @@ void bch_cached_dev_writeback_init(struct cached_dev *dc)
dc->writeback_running = true;
dc->writeback_percent = 10;
dc->writeback_delay = 30;
- dc->writeback_rate.rate = 1024;
+ atomic64_set(&dc->writeback_rate.rate, 1024);
dc->writeback_rate_minimum = 8;
dc->writeback_rate_update_seconds = WRITEBACK_RATE_UPDATE_SECS_DEFAULT;
--
2.17.1
From: Anssi Hannula <anssi.hannula(a)bitwise.fi>
There are several issues with the suspend/resume handling code of the
driver:
- The device is attached and detached in the runtime_suspend() and
runtime_resume() callbacks if the interface is running. However,
during xcan_chip_start() the interface is considered running,
causing the resume handler to incorrectly call netif_start_queue()
at the beginning of xcan_chip_start(), and on xcan_chip_start() error
return the suspend handler detaches the device leaving the user
unable to bring up the device anymore.
- The device is not brought properly up on system resume. A reset is
done and the code tries to determine the bus state after that.
However, after reset the device is always in Configuration mode
(down), so the state checking code does not make sense and
communication will also not work.
- The suspend callback tries to set the device to sleep mode (low-power
mode which monitors the bus and brings the device back to normal mode
on activity), but then immediately disables the clocks (possibly
before the device reaches the sleep mode), which does not make sense
to me. If a clean shutdown is wanted before disabling clocks, we can
just bring it down completely instead of only sleep mode.
Reorganize the PM code so that only the clock logic remains in the
runtime PM callbacks and the system PM callbacks contain the device
bring-up/down logic. This makes calling the runtime PM callbacks during
e.g. xcan_chip_start() safe.
The system PM callbacks now simply call common code to start/stop the
HW if the interface was running, replacing the broken code from before.
xcan_chip_stop() is updated to use the common reset code so that it will
wait for the reset to complete. Reset also disables all interrupts so do
not do that separately.
Also, the device_may_wakeup() checks are removed as the driver does not
have wakeup support.
Tested on Zynq-7000 integrated CAN.
Signed-off-by: Anssi Hannula <anssi.hannula(a)bitwise.fi>
Cc: Michal Simek <michal.simek(a)xilinx.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/xilinx_can.c | 69 +++++++++++++++---------------------
1 file changed, 28 insertions(+), 41 deletions(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index cb80a9aa7281..5a24039733ef 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -984,13 +984,9 @@ static irqreturn_t xcan_interrupt(int irq, void *dev_id)
static void xcan_chip_stop(struct net_device *ndev)
{
struct xcan_priv *priv = netdev_priv(ndev);
- u32 ier;
/* Disable interrupts and leave the can in configuration mode */
- ier = priv->read_reg(priv, XCAN_IER_OFFSET);
- ier &= ~XCAN_INTR_ALL;
- priv->write_reg(priv, XCAN_IER_OFFSET, ier);
- priv->write_reg(priv, XCAN_SRR_OFFSET, XCAN_SRR_RESET_MASK);
+ set_reset_mode(ndev);
priv->can.state = CAN_STATE_STOPPED;
}
@@ -1123,10 +1119,15 @@ static const struct net_device_ops xcan_netdev_ops = {
*/
static int __maybe_unused xcan_suspend(struct device *dev)
{
- if (!device_may_wakeup(dev))
- return pm_runtime_force_suspend(dev);
+ struct net_device *ndev = dev_get_drvdata(dev);
- return 0;
+ if (netif_running(ndev)) {
+ netif_stop_queue(ndev);
+ netif_device_detach(ndev);
+ xcan_chip_stop(ndev);
+ }
+
+ return pm_runtime_force_suspend(dev);
}
/**
@@ -1138,11 +1139,27 @@ static int __maybe_unused xcan_suspend(struct device *dev)
*/
static int __maybe_unused xcan_resume(struct device *dev)
{
- if (!device_may_wakeup(dev))
- return pm_runtime_force_resume(dev);
+ struct net_device *ndev = dev_get_drvdata(dev);
+ int ret;
- return 0;
+ ret = pm_runtime_force_resume(dev);
+ if (ret) {
+ dev_err(dev, "pm_runtime_force_resume failed on resume\n");
+ return ret;
+ }
+
+ if (netif_running(ndev)) {
+ ret = xcan_chip_start(ndev);
+ if (ret) {
+ dev_err(dev, "xcan_chip_start failed on resume\n");
+ return ret;
+ }
+ netif_device_attach(ndev);
+ netif_start_queue(ndev);
+ }
+
+ return 0;
}
/**
@@ -1157,14 +1174,6 @@ static int __maybe_unused xcan_runtime_suspend(struct device *dev)
struct net_device *ndev = dev_get_drvdata(dev);
struct xcan_priv *priv = netdev_priv(ndev);
- if (netif_running(ndev)) {
- netif_stop_queue(ndev);
- netif_device_detach(ndev);
- }
-
- priv->write_reg(priv, XCAN_MSR_OFFSET, XCAN_MSR_SLEEP_MASK);
- priv->can.state = CAN_STATE_SLEEPING;
-
clk_disable_unprepare(priv->bus_clk);
clk_disable_unprepare(priv->can_clk);
@@ -1183,7 +1192,6 @@ static int __maybe_unused xcan_runtime_resume(struct device *dev)
struct net_device *ndev = dev_get_drvdata(dev);
struct xcan_priv *priv = netdev_priv(ndev);
int ret;
- u32 isr, status;
ret = clk_prepare_enable(priv->bus_clk);
if (ret) {
@@ -1197,27 +1205,6 @@ static int __maybe_unused xcan_runtime_resume(struct device *dev)
return ret;
}
- priv->write_reg(priv, XCAN_SRR_OFFSET, XCAN_SRR_RESET_MASK);
- isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
- status = priv->read_reg(priv, XCAN_SR_OFFSET);
-
- if (netif_running(ndev)) {
- if (isr & XCAN_IXR_BSOFF_MASK) {
- priv->can.state = CAN_STATE_BUS_OFF;
- priv->write_reg(priv, XCAN_SRR_OFFSET,
- XCAN_SRR_RESET_MASK);
- } else if ((status & XCAN_SR_ESTAT_MASK) ==
- XCAN_SR_ESTAT_MASK) {
- priv->can.state = CAN_STATE_ERROR_PASSIVE;
- } else if (status & XCAN_SR_ERRWRN_MASK) {
- priv->can.state = CAN_STATE_ERROR_WARNING;
- } else {
- priv->can.state = CAN_STATE_ERROR_ACTIVE;
- }
- netif_device_attach(ndev);
- netif_start_queue(ndev);
- }
-
return 0;
}
--
2.18.0
From: Anssi Hannula <anssi.hannula(a)bitwise.fi>
xcan_interrupt() clears ERROR|RXOFLW|BSOFF|ARBLST interrupts if any of
them is asserted. This does not take into account that some of them
could have been asserted between interrupt status read and interrupt
clear, therefore clearing them without handling them.
Fix the code to only clear those interrupts that it knows are asserted
and therefore going to be processed in xcan_err_interrupt().
Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula(a)bitwise.fi>
Cc: Michal Simek <michal.simek(a)xilinx.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/xilinx_can.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index ea9f9d1a5ba7..cb80a9aa7281 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -938,6 +938,7 @@ static irqreturn_t xcan_interrupt(int irq, void *dev_id)
struct net_device *ndev = (struct net_device *)dev_id;
struct xcan_priv *priv = netdev_priv(ndev);
u32 isr, ier;
+ u32 isr_errors;
/* Get the interrupt status from Xilinx CAN */
isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
@@ -956,11 +957,10 @@ static irqreturn_t xcan_interrupt(int irq, void *dev_id)
xcan_tx_interrupt(ndev, isr);
/* Check for the type of error interrupt and Processing it */
- if (isr & (XCAN_IXR_ERROR_MASK | XCAN_IXR_RXOFLW_MASK |
- XCAN_IXR_BSOFF_MASK | XCAN_IXR_ARBLST_MASK)) {
- priv->write_reg(priv, XCAN_ICR_OFFSET, (XCAN_IXR_ERROR_MASK |
- XCAN_IXR_RXOFLW_MASK | XCAN_IXR_BSOFF_MASK |
- XCAN_IXR_ARBLST_MASK));
+ isr_errors = isr & (XCAN_IXR_ERROR_MASK | XCAN_IXR_RXOFLW_MASK |
+ XCAN_IXR_BSOFF_MASK | XCAN_IXR_ARBLST_MASK);
+ if (isr_errors) {
+ priv->write_reg(priv, XCAN_ICR_OFFSET, isr_errors);
xcan_err_interrupt(ndev, isr);
}
--
2.18.0
From: Anssi Hannula <anssi.hannula(a)bitwise.fi>
The xilinx_can driver assumes that the TXOK interrupt only clears after
it has been acknowledged as many times as there have been successfully
sent frames.
However, the documentation does not mention such behavior, instead
saying just that the interrupt is cleared when the clear bit is set.
Similarly, testing seems to also suggest that it is immediately cleared
regardless of the amount of frames having been sent. Performing some
heavy TX load and then going back to idle has the tx_head drifting
further away from tx_tail over time, steadily reducing the amount of
frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
interrupt always frees up space for 1 frame from the driver's
perspective, so frames continue to be sent) and delaying the local echo
frames.
The TX FIFO tracking is also otherwise buggy as it does not account for
TX FIFO being cleared after software resets, causing
BUG!, TX FIFO full when queue awake!
messages to be output.
There does not seem to be any way to accurately track the state of the
TX FIFO for local echo support while using the full TX FIFO.
The Zynq version of the HW (but not the soft-AXI version) has watermark
programming support and with it an additional TX-FIFO-empty interrupt
bit.
Modify the driver to only put 1 frame into TX FIFO at a time on soft-AXI
and 2 frames at a time on Zynq. On Zynq the TXFEMP interrupt bit is used
to detect whether 1 or 2 frames have been sent at interrupt processing
time.
Tested with the integrated CAN on Zynq-7000 SoC. The 1-frame-FIFO mode
was also tested.
An alternative way to solve this would be to drop local echo support but
keep using the full TX FIFO.
v2: Add FIFO space check before TX queue wake with locking to
synchronize with queue stop. This avoids waking the queue when xmit()
had just filled it.
v3: Keep local echo support and reduce the amount of frames in FIFO
instead as suggested by Marc Kleine-Budde.
Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula(a)bitwise.fi>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/xilinx_can.c | 139 +++++++++++++++++++++++++++++++----
1 file changed, 123 insertions(+), 16 deletions(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index 763408a3eafb..dcbdc3cd651c 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -26,8 +26,10 @@
#include <linux/module.h>
#include <linux/netdevice.h>
#include <linux/of.h>
+#include <linux/of_device.h>
#include <linux/platform_device.h>
#include <linux/skbuff.h>
+#include <linux/spinlock.h>
#include <linux/string.h>
#include <linux/types.h>
#include <linux/can/dev.h>
@@ -119,6 +121,7 @@ enum xcan_reg {
/**
* struct xcan_priv - This definition define CAN driver instance
* @can: CAN private data structure.
+ * @tx_lock: Lock for synchronizing TX interrupt handling
* @tx_head: Tx CAN packets ready to send on the queue
* @tx_tail: Tx CAN packets successfully sended on the queue
* @tx_max: Maximum number packets the driver can send
@@ -133,6 +136,7 @@ enum xcan_reg {
*/
struct xcan_priv {
struct can_priv can;
+ spinlock_t tx_lock;
unsigned int tx_head;
unsigned int tx_tail;
unsigned int tx_max;
@@ -160,6 +164,11 @@ static const struct can_bittiming_const xcan_bittiming_const = {
.brp_inc = 1,
};
+#define XCAN_CAP_WATERMARK 0x0001
+struct xcan_devtype_data {
+ unsigned int caps;
+};
+
/**
* xcan_write_reg_le - Write a value to the device register little endian
* @priv: Driver private data structure
@@ -239,6 +248,10 @@ static int set_reset_mode(struct net_device *ndev)
usleep_range(500, 10000);
}
+ /* reset clears FIFOs */
+ priv->tx_head = 0;
+ priv->tx_tail = 0;
+
return 0;
}
@@ -393,6 +406,7 @@ static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
struct net_device_stats *stats = &ndev->stats;
struct can_frame *cf = (struct can_frame *)skb->data;
u32 id, dlc, data[2] = {0, 0};
+ unsigned long flags;
if (can_dropped_invalid_skb(ndev, skb))
return NETDEV_TX_OK;
@@ -440,6 +454,9 @@ static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
data[1] = be32_to_cpup((__be32 *)(cf->data + 4));
can_put_echo_skb(skb, ndev, priv->tx_head % priv->tx_max);
+
+ spin_lock_irqsave(&priv->tx_lock, flags);
+
priv->tx_head++;
/* Write the Frame to Xilinx CAN TX FIFO */
@@ -455,10 +472,16 @@ static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
stats->tx_bytes += cf->can_dlc;
}
+ /* Clear TX-FIFO-empty interrupt for xcan_tx_interrupt() */
+ if (priv->tx_max > 1)
+ priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXFEMP_MASK);
+
/* Check if the TX buffer is full */
if ((priv->tx_head - priv->tx_tail) == priv->tx_max)
netif_stop_queue(ndev);
+ spin_unlock_irqrestore(&priv->tx_lock, flags);
+
return NETDEV_TX_OK;
}
@@ -832,19 +855,71 @@ static void xcan_tx_interrupt(struct net_device *ndev, u32 isr)
{
struct xcan_priv *priv = netdev_priv(ndev);
struct net_device_stats *stats = &ndev->stats;
+ unsigned int frames_in_fifo;
+ int frames_sent = 1; /* TXOK => at least 1 frame was sent */
+ unsigned long flags;
+ int retries = 0;
+
+ /* Synchronize with xmit as we need to know the exact number
+ * of frames in the FIFO to stay in sync due to the TXFEMP
+ * handling.
+ * This also prevents a race between netif_wake_queue() and
+ * netif_stop_queue().
+ */
+ spin_lock_irqsave(&priv->tx_lock, flags);
+
+ frames_in_fifo = priv->tx_head - priv->tx_tail;
- while ((priv->tx_head - priv->tx_tail > 0) &&
- (isr & XCAN_IXR_TXOK_MASK)) {
+ if (WARN_ON_ONCE(frames_in_fifo == 0)) {
+ /* clear TXOK anyway to avoid getting back here */
priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
+ spin_unlock_irqrestore(&priv->tx_lock, flags);
+ return;
+ }
+
+ /* Check if 2 frames were sent (TXOK only means that at least 1
+ * frame was sent).
+ */
+ if (frames_in_fifo > 1) {
+ WARN_ON(frames_in_fifo > priv->tx_max);
+
+ /* Synchronize TXOK and isr so that after the loop:
+ * (1) isr variable is up-to-date at least up to TXOK clear
+ * time. This avoids us clearing a TXOK of a second frame
+ * but not noticing that the FIFO is now empty and thus
+ * marking only a single frame as sent.
+ * (2) No TXOK is left. Having one could mean leaving a
+ * stray TXOK as we might process the associated frame
+ * via TXFEMP handling as we read TXFEMP *after* TXOK
+ * clear to satisfy (1).
+ */
+ while ((isr & XCAN_IXR_TXOK_MASK) && !WARN_ON(++retries == 100)) {
+ priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
+ isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
+ }
+
+ if (isr & XCAN_IXR_TXFEMP_MASK) {
+ /* nothing in FIFO anymore */
+ frames_sent = frames_in_fifo;
+ }
+ } else {
+ /* single frame in fifo, just clear TXOK */
+ priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
+ }
+
+ while (frames_sent--) {
can_get_echo_skb(ndev, priv->tx_tail %
priv->tx_max);
priv->tx_tail++;
stats->tx_packets++;
- isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
}
+
+ netif_wake_queue(ndev);
+
+ spin_unlock_irqrestore(&priv->tx_lock, flags);
+
can_led_event(ndev, CAN_LED_EVENT_TX);
xcan_update_error_state_after_rxtx(ndev);
- netif_wake_queue(ndev);
}
/**
@@ -1151,6 +1226,18 @@ static const struct dev_pm_ops xcan_dev_pm_ops = {
SET_RUNTIME_PM_OPS(xcan_runtime_suspend, xcan_runtime_resume, NULL)
};
+static const struct xcan_devtype_data xcan_zynq_data = {
+ .caps = XCAN_CAP_WATERMARK,
+};
+
+/* Match table for OF platform binding */
+static const struct of_device_id xcan_of_match[] = {
+ { .compatible = "xlnx,zynq-can-1.0", .data = &xcan_zynq_data },
+ { .compatible = "xlnx,axi-can-1.00.a", },
+ { /* end of list */ },
+};
+MODULE_DEVICE_TABLE(of, xcan_of_match);
+
/**
* xcan_probe - Platform registration call
* @pdev: Handle to the platform device structure
@@ -1165,8 +1252,10 @@ static int xcan_probe(struct platform_device *pdev)
struct resource *res; /* IO mem resources */
struct net_device *ndev;
struct xcan_priv *priv;
+ const struct of_device_id *of_id;
+ int caps = 0;
void __iomem *addr;
- int ret, rx_max, tx_max;
+ int ret, rx_max, tx_max, tx_fifo_depth;
/* Get the virtual base address for the device */
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
@@ -1176,7 +1265,8 @@ static int xcan_probe(struct platform_device *pdev)
goto err;
}
- ret = of_property_read_u32(pdev->dev.of_node, "tx-fifo-depth", &tx_max);
+ ret = of_property_read_u32(pdev->dev.of_node, "tx-fifo-depth",
+ &tx_fifo_depth);
if (ret < 0)
goto err;
@@ -1184,6 +1274,30 @@ static int xcan_probe(struct platform_device *pdev)
if (ret < 0)
goto err;
+ of_id = of_match_device(xcan_of_match, &pdev->dev);
+ if (of_id) {
+ const struct xcan_devtype_data *devtype_data = of_id->data;
+
+ if (devtype_data)
+ caps = devtype_data->caps;
+ }
+
+ /* There is no way to directly figure out how many frames have been
+ * sent when the TXOK interrupt is processed. If watermark programming
+ * is supported, we can have 2 frames in the FIFO and use TXFEMP
+ * to determine if 1 or 2 frames have been sent.
+ * Theoretically we should be able to use TXFWMEMP to determine up
+ * to 3 frames, but it seems that after putting a second frame in the
+ * FIFO, with watermark at 2 frames, it can happen that TXFWMEMP (less
+ * than 2 frames in FIFO) is set anyway with no TXOK (a frame was
+ * sent), which is not a sensible state - possibly TXFWMEMP is not
+ * completely synchronized with the rest of the bits?
+ */
+ if (caps & XCAN_CAP_WATERMARK)
+ tx_max = min(tx_fifo_depth, 2);
+ else
+ tx_max = 1;
+
/* Create a CAN device instance */
ndev = alloc_candev(sizeof(struct xcan_priv), tx_max);
if (!ndev)
@@ -1198,6 +1312,7 @@ static int xcan_probe(struct platform_device *pdev)
CAN_CTRLMODE_BERR_REPORTING;
priv->reg_base = addr;
priv->tx_max = tx_max;
+ spin_lock_init(&priv->tx_lock);
/* Get IRQ for the device */
ndev->irq = platform_get_irq(pdev, 0);
@@ -1262,9 +1377,9 @@ static int xcan_probe(struct platform_device *pdev)
pm_runtime_put(&pdev->dev);
- netdev_dbg(ndev, "reg_base=0x%p irq=%d clock=%d, tx fifo depth:%d\n",
+ netdev_dbg(ndev, "reg_base=0x%p irq=%d clock=%d, tx fifo depth: actual %d, using %d\n",
priv->reg_base, ndev->irq, priv->can.clock.freq,
- priv->tx_max);
+ tx_fifo_depth, priv->tx_max);
return 0;
@@ -1298,14 +1413,6 @@ static int xcan_remove(struct platform_device *pdev)
return 0;
}
-/* Match table for OF platform binding */
-static const struct of_device_id xcan_of_match[] = {
- { .compatible = "xlnx,zynq-can-1.0", },
- { .compatible = "xlnx,axi-can-1.00.a", },
- { /* end of list */ },
-};
-MODULE_DEVICE_TABLE(of, xcan_of_match);
-
static struct platform_driver xcan_driver = {
.probe = xcan_probe,
.remove = xcan_remove,
--
2.18.0
From: Anssi Hannula <anssi.hannula(a)bitwise.fi>
If the device gets into a state where RXNEMP (RX FIFO not empty)
interrupt is asserted without RXOK (new frame received successfully)
interrupt being asserted, xcan_rx_poll() will continue to try to clear
RXNEMP without actually reading frames from RX FIFO. If the RX FIFO is
not empty, the interrupt will not be cleared and napi_schedule() will
just be called again.
This situation can occur when:
(a) xcan_rx() returns without reading RX FIFO due to an error condition.
The code tries to clear both RXOK and RXNEMP but RXNEMP will not clear
due to a frame still being in the FIFO. The frame will never be read
from the FIFO as RXOK is no longer set.
(b) A frame is received between xcan_rx_poll() reading interrupt status
and clearing RXOK. RXOK will be cleared, but RXNEMP will again remain
set as the new message is still in the FIFO.
I'm able to trigger case (b) by flooding the bus with frames under load.
There does not seem to be any benefit in using both RXNEMP and RXOK in
the way the driver does, and the polling example in the reference manual
(UG585 v1.10 18.3.7 Read Messages from RxFIFO) also says that either
RXOK or RXNEMP can be used for detecting incoming messages.
Fix the issue and simplify the RX processing by only using RXNEMP
without RXOK.
Tested with the integrated CAN on Zynq-7000 SoC.
Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula(a)bitwise.fi>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/xilinx_can.c | 18 +++++-------------
1 file changed, 5 insertions(+), 13 deletions(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index 389a9603db8c..1bda47aa62f5 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -101,7 +101,7 @@ enum xcan_reg {
#define XCAN_INTR_ALL (XCAN_IXR_TXOK_MASK | XCAN_IXR_BSOFF_MASK |\
XCAN_IXR_WKUP_MASK | XCAN_IXR_SLP_MASK | \
XCAN_IXR_RXNEMP_MASK | XCAN_IXR_ERROR_MASK | \
- XCAN_IXR_ARBLST_MASK | XCAN_IXR_RXOK_MASK)
+ XCAN_IXR_ARBLST_MASK)
/* CAN register bit shift - XCAN_<REG>_<BIT>_SHIFT */
#define XCAN_BTR_SJW_SHIFT 7 /* Synchronous jump width */
@@ -708,15 +708,7 @@ static int xcan_rx_poll(struct napi_struct *napi, int quota)
isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
while ((isr & XCAN_IXR_RXNEMP_MASK) && (work_done < quota)) {
- if (isr & XCAN_IXR_RXOK_MASK) {
- priv->write_reg(priv, XCAN_ICR_OFFSET,
- XCAN_IXR_RXOK_MASK);
- work_done += xcan_rx(ndev);
- } else {
- priv->write_reg(priv, XCAN_ICR_OFFSET,
- XCAN_IXR_RXNEMP_MASK);
- break;
- }
+ work_done += xcan_rx(ndev);
priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_RXNEMP_MASK);
isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
}
@@ -727,7 +719,7 @@ static int xcan_rx_poll(struct napi_struct *napi, int quota)
if (work_done < quota) {
napi_complete_done(napi, work_done);
ier = priv->read_reg(priv, XCAN_IER_OFFSET);
- ier |= (XCAN_IXR_RXOK_MASK | XCAN_IXR_RXNEMP_MASK);
+ ier |= XCAN_IXR_RXNEMP_MASK;
priv->write_reg(priv, XCAN_IER_OFFSET, ier);
}
return work_done;
@@ -799,9 +791,9 @@ static irqreturn_t xcan_interrupt(int irq, void *dev_id)
}
/* Check for the type of receive interrupt and Processing it */
- if (isr & (XCAN_IXR_RXNEMP_MASK | XCAN_IXR_RXOK_MASK)) {
+ if (isr & XCAN_IXR_RXNEMP_MASK) {
ier = priv->read_reg(priv, XCAN_IER_OFFSET);
- ier &= ~(XCAN_IXR_RXNEMP_MASK | XCAN_IXR_RXOK_MASK);
+ ier &= ~XCAN_IXR_RXNEMP_MASK;
priv->write_reg(priv, XCAN_IER_OFFSET, ier);
napi_schedule(&priv->napi);
}
--
2.18.0
From: Anssi Hannula <anssi.hannula(a)bitwise.fi>
The xilinx_can driver performs a software reset when an RX overrun is
detected. This causes the device to enter Configuration mode where no
messages are received or transmitted.
The documentation does not mention any need to perform a reset on an RX
overrun, and testing by inducing an RX overflow also indicated that the
device continues to work just fine without a reset.
Remove the software reset.
Tested with the integrated CAN on Zynq-7000 SoC.
Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula(a)bitwise.fi>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/xilinx_can.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index 89aec07c225f..389a9603db8c 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -600,7 +600,6 @@ static void xcan_err_interrupt(struct net_device *ndev, u32 isr)
if (isr & XCAN_IXR_RXOFLW_MASK) {
stats->rx_over_errors++;
stats->rx_errors++;
- priv->write_reg(priv, XCAN_SRR_OFFSET, XCAN_SRR_RESET_MASK);
if (skb) {
cf->can_id |= CAN_ERR_CRTL;
cf->data[1] |= CAN_ERR_CRTL_RX_OVERFLOW;
--
2.18.0
From: Faiz Abbas <faiz_abbas(a)ti.com>
pm_runtime_get_sync() returns 1 if the state of the device is already
'active'. This is not a failure case and should be treated as success.
Therefore fix the error handling for the pm_runtime_get_sync() call so
that only negative return values are treated as errors.
Also cleanup the TODO for using runtime PM for sleep mode as that is
implemented.
Signed-off-by: Faiz Abbas <faiz_abbas(a)ti.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/m_can/m_can.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index 8e2b7f873c4d..e2f965c2e3aa 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -634,10 +634,12 @@ static int m_can_clk_start(struct m_can_priv *priv)
int err;
err = pm_runtime_get_sync(priv->device);
- if (err)
+ if (err < 0) {
pm_runtime_put_noidle(priv->device);
+ return err;
+ }
- return err;
+ return 0;
}
static void m_can_clk_stop(struct m_can_priv *priv)
@@ -1688,8 +1690,6 @@ static int m_can_plat_probe(struct platform_device *pdev)
return ret;
}
-/* TODO: runtime PM with power down or sleep mode */
-
static __maybe_unused int m_can_suspend(struct device *dev)
{
struct net_device *ndev = dev_get_drvdata(dev);
--
2.18.0
From: Roman Fietze <roman.fietze(a)telemotive.de>
Inside m_can_chip_config(), when setting up the new value of the CCCR,
the CCCR_NISO bit is not cleared like the others, CCCR_TEST, CCCR_MON,
CCCR_BRSE and CCCR_FDOE, before checking the can.ctrlmode bits for
CAN_CTRLMODE_FD_NON_ISO.
This way once the controller was configured for CAN_CTRLMODE_FD_NON_ISO,
this mode could never be cleared again.
This fix is only relevant for controllers with version 3.1.x or 3.2.x.
Older versions do not support NISO.
Signed-off-by: Roman Fietze <roman.fietze(a)telemotive.de>
Cc: linux-stable <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/m_can/m_can.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index b397a33f3d32..8e2b7f873c4d 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -1109,7 +1109,8 @@ static void m_can_chip_config(struct net_device *dev)
} else {
/* Version 3.1.x or 3.2.x */
- cccr &= ~(CCCR_TEST | CCCR_MON | CCCR_BRSE | CCCR_FDOE);
+ cccr &= ~(CCCR_TEST | CCCR_MON | CCCR_BRSE | CCCR_FDOE |
+ CCCR_NISO);
/* Only 3.2.x has NISO Bit implemented */
if (priv->can.ctrlmode & CAN_CTRLMODE_FD_NON_ISO)
--
2.18.0
From: Stephane Grosjean <s.grosjean(a)peak-system.com>
The DMA logic in firmwares < v3.3.0 embedded in the PCAN-PCIe FD cards
family is not capable of handling a mix of 32-bit and 64-bit logical
addresses. If the board is equipped with 2 or 4 CAN ports, then such a
situation might lead to a PCIe Bus Error "Malformed TLP" packet
as well as "irq xx: nobody cared" issue.
This patch adds a workaround that requests only 32-bit DMA addresses
when these might be allocated outside of the 4 GB area.
This issue has been fixed in firmware v3.3.0 and later.
Signed-off-by: Stephane Grosjean <s.grosjean(a)peak-system.com>
Cc: linux-stable <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/peak_canfd/peak_pciefd_main.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/drivers/net/can/peak_canfd/peak_pciefd_main.c b/drivers/net/can/peak_canfd/peak_pciefd_main.c
index b9e28578bc7b..455a3797a200 100644
--- a/drivers/net/can/peak_canfd/peak_pciefd_main.c
+++ b/drivers/net/can/peak_canfd/peak_pciefd_main.c
@@ -58,6 +58,10 @@ MODULE_LICENSE("GPL v2");
#define PCIEFD_REG_SYS_VER1 0x0040 /* version reg #1 */
#define PCIEFD_REG_SYS_VER2 0x0044 /* version reg #2 */
+#define PCIEFD_FW_VERSION(x, y, z) (((u32)(x) << 24) | \
+ ((u32)(y) << 16) | \
+ ((u32)(z) << 8))
+
/* System Control Registers Bits */
#define PCIEFD_SYS_CTL_TS_RST 0x00000001 /* timestamp clock */
#define PCIEFD_SYS_CTL_CLK_EN 0x00000002 /* system clock */
@@ -782,6 +786,21 @@ static int peak_pciefd_probe(struct pci_dev *pdev,
"%ux CAN-FD PCAN-PCIe FPGA v%u.%u.%u:\n", can_count,
hw_ver_major, hw_ver_minor, hw_ver_sub);
+#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
+ /* FW < v3.3.0 DMA logic doesn't handle correctly the mix of 32-bit and
+ * 64-bit logical addresses: this workaround forces usage of 32-bit
+ * DMA addresses only when such a fw is detected.
+ */
+ if (PCIEFD_FW_VERSION(hw_ver_major, hw_ver_minor, hw_ver_sub) <
+ PCIEFD_FW_VERSION(3, 3, 0)) {
+ err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
+ if (err)
+ dev_warn(&pdev->dev,
+ "warning: can't set DMA mask %llxh (err %d)\n",
+ DMA_BIT_MASK(32), err);
+ }
+#endif
+
/* stop system clock */
pciefd_sys_writereg(pciefd, PCIEFD_SYS_CTL_CLK_EN,
PCIEFD_REG_SYS_CTL_CLR);
--
2.18.0
Greg,
this series contains backports of the following upstream commits:
243a4f8126fc ubi: Introduce vol_ignored()
fdf10ed710c0 ubi: Rework Fastmap attach base code
74f2c6e9a47c ubi: Be more paranoid while seaching for the most recent Fastmap
2e8f08deabbc ubi: Fix races around ubi_refill_pools()
f7d11b33d4e8 ubi: Fix Fastmap's update_vol()
5793f39de7f6 ubi: fastmap: Erase outdated anchor PEBs during attach
The first two patches are not directly stable patches but the other patches
depend on them.
Richard Weinberger (5):
ubi: Introduce vol_ignored()
ubi: Rework Fastmap attach base code
ubi: Be more paranoid while seaching for the most recent Fastmap
ubi: Fix races around ubi_refill_pools()
ubi: Fix Fastmap's update_vol()
Sascha Hauer (1):
ubi: fastmap: Erase outdated anchor PEBs during attach
drivers/mtd/ubi/attach.c | 139 ++++++++++++++++++++++++++---------
drivers/mtd/ubi/eba.c | 4 +-
drivers/mtd/ubi/fastmap-wl.c | 6 +-
drivers/mtd/ubi/fastmap.c | 51 +++++++++++--
drivers/mtd/ubi/ubi.h | 46 +++++++++++-
drivers/mtd/ubi/wl.c | 114 ++++++++++++++++++++++------
6 files changed, 292 insertions(+), 68 deletions(-)
--
2.18.0
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b03897cf318dfc47de33a7ecbc7655584266f034 Mon Sep 17 00:00:00 2001
From: "Gautham R. Shenoy" <ego(a)linux.vnet.ibm.com>
Date: Wed, 18 Jul 2018 14:03:16 +0530
Subject: [PATCH] powerpc/powernv: Fix save/restore of SPRG3 on entry/exit from
stop (idle)
On 64-bit servers, SPRN_SPRG3 and its userspace read-only mirror
SPRN_USPRG3 are used as userspace VDSO write and read registers
respectively.
SPRN_SPRG3 is lost when we enter stop4 and above, and is currently not
restored. As a result, any read from SPRN_USPRG3 returns zero on an
exit from stop4 (Power9 only) and above.
Thus in this situation, on POWER9, any call to sched_getcpu() always
returns zero, as on powerpc we call __kernel_getcpu(), which relies
upon SPRN_USPRG3 to report the CPU and NUMA node information.
Fix this by restoring SPRN_SPRG3 on wake up from a deep stop state
with the sprg_vdso value that is cached in PACA.
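As an illustration only (not part of the patch), a minimal user-space sketch
that exercises the path described above; the assumption is that it is run on
an affected POWER9 system after a wakeup from stop4 or deeper, without this
fix:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
	/* goes through the powerpc VDSO's __kernel_getcpu(), which reads
	 * SPRN_USPRG3; on an affected system this prints 0 regardless of
	 * the CPU the task actually runs on */
	printf("sched_getcpu() = %d\n", sched_getcpu());
	return 0;
}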
Fixes: e1c1cfed5432 ("powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle")
Cc: stable(a)vger.kernel.org # v4.14+
Reported-by: Florian Weimer <fweimer(a)redhat.com>
Signed-off-by: Gautham R. Shenoy <ego(a)linux.vnet.ibm.com>
Reviewed-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index e734f6e45abc..689306118b48 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -144,7 +144,9 @@ power9_restore_additional_sprs:
mtspr SPRN_MMCR1, r4
ld r3, STOP_MMCR2(r13)
+ ld r4, PACA_SPRG_VDSO(r13)
mtspr SPRN_MMCR2, r3
+ mtspr SPRN_SPRG3, r4
blr
/*
Correcting the stable ML address ...
On 23/07/18 10:59, Jon Hunter wrote:
> Please include the following fix for stable v4.4.y. If the PLL_U is not
> configured by the bootloader, then without this change it will not be
> configured by the kernel and this will cause USB host support to fail
> which uses the PLL_U for its clock.
>
> Please note that this patch did not apply cleanly to v4.4.y, so I have
> back-ported, but the resulting change is the same as the original.
>
> From 797097301860c64b63346d068ba4fe4992bd5021 Mon Sep 17 00:00:00 2001
> From: Lucas Stach <dev(a)lynxeye.de>
> Date: Mon, 29 Feb 2016 21:46:07 +0100
> Subject: [PATCH] clk: tegra: Fix PLL_U post divider and initial rate on
> Tegra30
>
> commit 797097301860c64b63346d068ba4fe4992bd5021 upstream
>
> The post divider value in the frequency table is wrong as it would lead
> to the PLL producing an output rate of 960 MHz instead of the desired
> 480 MHz. This wasn't a problem as nothing used the table to actually
> initialize the PLL rate, but the bootloader configuration was used
> unaltered.
>
> If the bootloader does not set up the PLL it will fail to come up when used
> under Linux. To fix this don't rely on the bootloader, but set the
> correct rate in the clock driver.
>
> Change-Id: I9375c24ef0d48b1b98be10378e8d593299b0453b
> Signed-off-by: Lucas Stach <dev(a)lynxeye.de>
> Signed-off-by: Thierry Reding <treding(a)nvidia.com>
> [jonathanh(a)nvidia.com: Back-ported to stable v4.4.y]
> Signed-off-by: Jon Hunter <jonathanh(a)nvidia.com>
> ---
> drivers/clk/tegra/clk-tegra30.c | 11 ++++++-----
> 1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/clk/tegra/clk-tegra30.c b/drivers/clk/tegra/clk-tegra30.c
> index 8c41c6fcb9ee..acf83569f86f 100644
> --- a/drivers/clk/tegra/clk-tegra30.c
> +++ b/drivers/clk/tegra/clk-tegra30.c
> @@ -333,11 +333,11 @@ static struct pdiv_map pllu_p[] = {
> };
>
> static struct tegra_clk_pll_freq_table pll_u_freq_table[] = {
> - { 12000000, 480000000, 960, 12, 0, 12},
> - { 13000000, 480000000, 960, 13, 0, 12},
> - { 16800000, 480000000, 400, 7, 0, 5},
> - { 19200000, 480000000, 200, 4, 0, 3},
> - { 26000000, 480000000, 960, 26, 0, 12},
> + { 12000000, 480000000, 960, 12, 2, 12 },
> + { 13000000, 480000000, 960, 13, 2, 12 },
> + { 16800000, 480000000, 400, 7, 2, 5 },
> + { 19200000, 480000000, 200, 4, 2, 3 },
> + { 26000000, 480000000, 960, 26, 2, 12 },
> { 0, 0, 0, 0, 0, 0 },
> };
>
> @@ -1372,6 +1372,7 @@ static struct tegra_clk_init_table init_table[] __initdata = {
> {TEGRA30_CLK_GR2D, TEGRA30_CLK_PLL_C, 300000000, 0},
> {TEGRA30_CLK_GR3D, TEGRA30_CLK_PLL_C, 300000000, 0},
> {TEGRA30_CLK_GR3D2, TEGRA30_CLK_PLL_C, 300000000, 0},
> + { TEGRA30_CLK_PLL_U, TEGRA30_CLK_CLK_MAX, 480000000, 0 },
> {TEGRA30_CLK_CLK_MAX, TEGRA30_CLK_CLK_MAX, 0, 0}, /* This MUST be the last entry. */
> };
>
>
--
nvpublic
Adding stable and lkml.
Sorry for spamming others.
-Mukesh
On 7/23/2018 1:57 PM, Mukesh Ojha wrote:
>
> Hi All,
>
> I wanted to discuss a corner case that exists in the 4.9 kernel (4.9.x):
> a hotplug attempt for a CPU fails due to a failure in one of the
> callbacks that run after "notify:online" (notify_online creates the
> sysfs nodes for the hotplugged CPU).
>
> While cleaning up, notify_dead() does not get called, because
> step->skip_onerr is set to true for "notify:prepare". Because of that,
> the sysfs nodes of that CPU do not get cleaned up, which can cause
> issues in the next hotplug attempt of that CPU.
>
> The failing path:
> cpuhp_up_callbacks => cpuhp_invoke_callback => undo_cpu_up
>
> .name = "notify:prepare",
> .teardown.single = notify_dead,
> .skip_onerr = true,
>
> I think the possible solution here could be to remove
> - .skip_onerr = true,
>
> for "notify:prepare" so that the CPU_DEAD notification gets sent (a
> minimal sketch of the affected step follows below the quoted mail).
>
> Please feel free to suggest if this has any side effects; I don't see
> any.
>
> Ref:
>
> https://elixir.bootlin.com/linux/v4.9/source/kernel/cpu.c#L458
>
> Cheers,
> Mukesh
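For reference, a minimal sketch of the 4.9 cpuhp step being discussed (based
on the fields quoted in the mail plus the surrounding 4.9-era kernel/cpu.c
entry; the exact layout in the tree may differ), with the proposed change
marked:

	/* kernel/cpu.c, 4.9-era cpuhp_bp_states[] entry (sketch only) */
	[CPUHP_NOTIFY_PREPARE] = {
		.name			= "notify:prepare",
		.startup.single		= notify_prepare,
		.teardown.single	= notify_dead,
		.skip_onerr		= true,	/* proposal: drop this line so that
						 * notify_dead() runs on rollback and
						 * the CPU's sysfs nodes get cleaned up */
		.cant_stop		= true,
	},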
Please cherry-pick upstream commits:
d03db2b "compiler-gcc.h: Add __attribute__((gnu_inline)) to all inline
declarations"
0e2e160 "x86/asm: Add _ASM_ARG* constants for argument registers to <asm/asm.h>"
d0a8d93 "x86/paravirt: Make native_save_fl() extern inline"
to stable branches 4.4+. They will allow 4.4+ x86 kernels compiled
with Clang, with CONFIG_STACK_PROTECTOR_STRONG and CONFIG_PARAVIRT
enabled, to boot. They also allow gcc 5.1+ users to have
consistent `extern inline` semantics.
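A minimal sketch of the attribute these commits rely on (invented function
name, not the kernel's actual definitions): with __attribute__((gnu_inline)),
"extern inline" keeps the traditional GNU89 meaning, so the body below may be
inlined into callers but no out-of-line symbol is emitted from this
translation unit; exactly one ordinary definition elsewhere (another C file,
or an assembly file, as the kernel provides for native_save_fl()) resolves
the symbol at link time:

extern inline __attribute__((gnu_inline)) unsigned long add_one(unsigned long x)
{
	/* available for inlining only; no standalone copy is generated here */
	return x + 1;
}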
In response to:
https://android-review.googlesource.com/c/kernel/common/+/716477#message-29…
One of these days I'll remember to cc stable in the commit message
properly...sorry!
--
Thanks,
~Nick Desaulniers
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a8f688ec437dc2045cc8f0c89fe877d5803850da Mon Sep 17 00:00:00 2001
From: Chuck Lever <chuck.lever(a)oracle.com>
Date: Fri, 4 May 2018 15:35:46 -0400
Subject: [PATCH] xprtrdma: Return -ENOBUFS when no pages are available
The use of -EAGAIN in rpcrdma_convert_iovs() is a latent bug: the
transport never calls xprt_write_space() when more pages become
available. -ENOBUFS will trigger the correct "delay briefly and call
again" logic.
Fixes: 7a89f9c626e3 ("xprtrdma: Honor ->send_request API contract")
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
Cc: stable(a)vger.kernel.org # 4.8+
Signed-off-by: Anna Schumaker <Anna.Schumaker(a)Netapp.com>
diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index d676106295ff..1d7857919d3d 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -231,7 +231,7 @@ rpcrdma_convert_iovs(struct rpcrdma_xprt *r_xprt, struct xdr_buf *xdrbuf,
*/
*ppages = alloc_page(GFP_ATOMIC);
if (!*ppages)
- return -EAGAIN;
+ return -ENOBUFS;
}
seg->mr_page = *ppages;
seg->mr_offset = (char *)page_base;
Tree/Branch: v4.4.143
Git describe: v4.4.143
Commit: a8ea6276d0 Linux 4.4.143
Build Time: 63 min 40 sec
Passed: 10 / 10 (100.00 %)
Failed: 0 / 10 ( 0.00 %)
Errors: 0
Warnings: 31
Section Mismatches: 0
-------------------------------------------------------------------------------
defconfigs with issues (other than build errors):
19 warnings 0 mismatches : arm64-allmodconfig
17 warnings 0 mismatches : x86_64-allmodconfig
-------------------------------------------------------------------------------
Warnings Summary: 31
3 warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
2 ../drivers/media/dvb-frontends/stv090x.c:4250:1: warning: the frame size of 4832 bytes is larger than 2048 bytes [-Wframe-larger-than=]
2 ../drivers/media/dvb-frontends/stv090x.c:1211:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
2 ../drivers/media/dvb-frontends/stv090x.c:1168:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2752 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2720 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:4759:1: warning: the frame size of 2056 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2096 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 6784 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5864 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5840 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2304 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2104 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2552 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2544 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3264 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3248 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 3008 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 2992 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5296 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv0367.c:3147:1: warning: the frame size of 4144 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv0367.c:2490:1: warning: the frame size of 3424 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2984 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2976 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4336 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4328 bytes is larger than 2048 bytes [-Wframe-larger-than=]
===============================================================================
Detailed per-defconfig build reports below:
-------------------------------------------------------------------------------
arm64-allmodconfig : PASS, 0 errors, 19 warnings, 0 section mismatches
Warnings:
warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 2992 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3248 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1211:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4250:1: warning: the frame size of 4832 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1168:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2544 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5840 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 6784 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv0367.c:2490:1: warning: the frame size of 3424 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2976 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4336 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2720 bytes is larger than 2048 bytes [-Wframe-larger-than=]
-------------------------------------------------------------------------------
x86_64-allmodconfig : PASS, 0 errors, 17 warnings, 0 section mismatches
Warnings:
../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 3008 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2104 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2304 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2096 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3264 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5296 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1211:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4250:1: warning: the frame size of 4832 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4759:1: warning: the frame size of 2056 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1168:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2552 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5864 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv0367.c:3147:1: warning: the frame size of 4144 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2984 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4328 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2752 bytes is larger than 2048 bytes [-Wframe-larger-than=]
-------------------------------------------------------------------------------
Passed with no errors, warnings or mismatches:
arm64-allnoconfig
arm-multi_v5_defconfig
arm-multi_v7_defconfig
x86_64-defconfig
arm-allmodconfig
arm-allnoconfig
x86_64-allnoconfig
arm64-defconfig
While working on extended range for last_error/first_error timestamps,
I noticed that the endianness is wrong: we access the little-endian
fields in struct ext4_super_block as native-endian when we print them.
This adds a special case in ext4_attr_show() and ext4_attr_store()
to byteswap the superblock fields if needed.
In older kernels, this code was part of super.c; it got moved to sysfs.c
in linux-4.4.
Cc: stable(a)vger.kernel.org
Fixes: 52c198c6820f ("ext4: add sysfs entry showing whether the fs contains errors")
Reviewed-by: Andreas Dilger <adilger(a)dilger.ca>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
fs/ext4/sysfs.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/fs/ext4/sysfs.c b/fs/ext4/sysfs.c
index f34da0bb8f17..b970a200f20c 100644
--- a/fs/ext4/sysfs.c
+++ b/fs/ext4/sysfs.c
@@ -274,8 +274,12 @@ static ssize_t ext4_attr_show(struct kobject *kobj,
case attr_pointer_ui:
if (!ptr)
return 0;
- return snprintf(buf, PAGE_SIZE, "%u\n",
- *((unsigned int *) ptr));
+ if (a->attr_ptr == ptr_ext4_super_block_offset)
+ return snprintf(buf, PAGE_SIZE, "%u\n",
+ le32_to_cpup(ptr));
+ else
+ return snprintf(buf, PAGE_SIZE, "%u\n",
+ *((unsigned int *) ptr));
case attr_pointer_atomic:
if (!ptr)
return 0;
@@ -308,7 +312,10 @@ static ssize_t ext4_attr_store(struct kobject *kobj,
ret = kstrtoul(skip_spaces(buf), 0, &t);
if (ret)
return ret;
- *((unsigned int *) ptr) = t;
+ if (a->attr_ptr == ptr_ext4_super_block_offset)
+ *((__le32 *) ptr) = cpu_to_le32(t);
+ else
+ *((unsigned int *) ptr) = t;
return len;
case attr_inode_readahead:
return inode_readahead_blks_store(sbi, buf, len);
--
2.9.0
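A small userspace analogue of the byte-order issue fixed above (the kernel
uses le32_to_cpup()/cpu_to_le32() for this; the read_le32() helper below is
only illustrative): reading a little-endian on-disk field as a native
unsigned int is correct on little-endian hosts only.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Illustrative equivalent of le32_to_cpu(): assemble the value byte by
 * byte so the result is host-order on any endianness. */
static uint32_t read_le32(const uint8_t *p)
{
        return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

int main(void)
{
        /* 0x00000102 stored little-endian, like an ext4_super_block field. */
        uint8_t disk_field[4] = { 0x02, 0x01, 0x00, 0x00 };
        uint32_t native;

        /* A raw native-endian read is right on little-endian hosts only;
         * a big-endian host sees 0x02010000 instead of 0x00000102. */
        memcpy(&native, disk_field, sizeof(native));

        printf("raw native read: 0x%08x\n", native);
        printf("explicit le32:   0x%08x\n", read_le32(disk_field));
        return 0;
}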
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
commit 229bc19fd7aca4f37964af06e3583c1c8f36b5d6 upstream.
Don't rely on event interrupt (EINT) bit alone to detect pending port
change in resume. If no change event is detected the host may be suspended
again, otherwise roothubs are resumed.
There is a lag in xHC setting EINT. If we don't notice the pending change
in resume, and the controller is runtime suspended again, it causes the event
event handler to assume host is dead as it will fail to read xHC registers
once PCI puts the controller to D3 state.
[ 268.520969] xhci_hcd: xhci_resume: starting port polling.
[ 268.520985] xhci_hcd: xhci_hub_status_data: stopping port polling.
[ 268.521030] xhci_hcd: xhci_suspend: stopping port polling.
[ 268.521040] xhci_hcd: // Setting command ring address to 0x349bd001
[ 268.521139] xhci_hcd: Port Status Change Event for port 3
[ 268.521149] xhci_hcd: resume root hub
[ 268.521163] xhci_hcd: port resume event for port 3
[ 268.521168] xhci_hcd: xHC is not running.
[ 268.521174] xhci_hcd: handle_port_status: starting port polling.
[ 268.596322] xhci_hcd: xhci_hc_died: xHCI host controller not responding, assume dead
The EINT lag is described in an additional note in xhci specs 4.19.2:
"Due to internal xHC scheduling and system delays, there will be a lag
between a change bit being set and the Port Status Change Event that it
generated being written to the Event Ring. If SW reads the PORTSC and
sees a change bit set, there is no guarantee that the corresponding Port
Status Change Event has already been written into the Event Ring."
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
---
drivers/usb/host/xhci.c | 40 +++++++++++++++++++++++++++++++++++++---
drivers/usb/host/xhci.h | 4 ++++
2 files changed, 41 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index e5bccc6d49cf..fe84b36627ec 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -856,6 +856,41 @@ static void xhci_disable_port_wake_on_bits(struct xhci_hcd *xhci)
spin_unlock_irqrestore(&xhci->lock, flags);
}
+static bool xhci_pending_portevent(struct xhci_hcd *xhci)
+{
+ __le32 __iomem **port_array;
+ int port_index;
+ u32 status;
+ u32 portsc;
+
+ status = readl(&xhci->op_regs->status);
+ if (status & STS_EINT)
+ return true;
+ /*
+ * Checking STS_EINT is not enough as there is a lag between a change
+ * bit being set and the Port Status Change Event that it generated
+ * being written to the Event Ring. See note in xhci 1.1 section 4.19.2.
+ */
+
+ port_index = xhci->num_usb2_ports;
+ port_array = xhci->usb2_ports;
+ while (port_index--) {
+ portsc = readl(port_array[port_index]);
+ if (portsc & PORT_CHANGE_MASK ||
+ (portsc & PORT_PLS_MASK) == XDEV_RESUME)
+ return true;
+ }
+ port_index = xhci->num_usb3_ports;
+ port_array = xhci->usb3_ports;
+ while (port_index--) {
+ portsc = readl(port_array[port_index]);
+ if (portsc & PORT_CHANGE_MASK ||
+ (portsc & PORT_PLS_MASK) == XDEV_RESUME)
+ return true;
+ }
+ return false;
+}
+
/*
* Stop HC (not bus-specific)
*
@@ -955,7 +990,7 @@ EXPORT_SYMBOL_GPL(xhci_suspend);
*/
int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
{
- u32 command, temp = 0, status;
+ u32 command, temp = 0;
struct usb_hcd *hcd = xhci_to_hcd(xhci);
struct usb_hcd *secondary_hcd;
int retval = 0;
@@ -1077,8 +1112,7 @@ int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
done:
if (retval == 0) {
/* Resume root hubs only when have pending events. */
- status = readl(&xhci->op_regs->status);
- if (status & STS_EINT) {
+ if (xhci_pending_portevent(xhci)) {
usb_hcd_resume_root_hub(xhci->shared_hcd);
usb_hcd_resume_root_hub(hcd);
}
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 2a72060dda1b..11232e62b898 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -392,6 +392,10 @@ struct xhci_op_regs {
#define PORT_PLC (1 << 22)
/* port configure error change - port failed to configure its link partner */
#define PORT_CEC (1 << 23)
+#define PORT_CHANGE_MASK (PORT_CSC | PORT_PEC | PORT_WRC | PORT_OCC | \
+ PORT_RC | PORT_PLC | PORT_CEC)
+
+
/* Cold Attach Status - xHC can set this bit to report device attached during
* Sx state. Warm port reset should be perfomed to clear this bit and move port
* to connected state.
--
2.17.1
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 02ce6ce2e1d07e31e8314c761a2caa087ea094ce Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher(a)amd.com>
Date: Thu, 12 Jul 2018 08:38:09 -0500
Subject: [PATCH] drm/amdgpu/pp/smu7: use a local variable for toc indexing
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Use a local variable rather than the index variable stored in vram. If
the device fails to come back online after a resume cycle,
reads from vram will return all 1s, which will cause a
segfault. Based on a patch from Thomas Martitz <kugel(a)rockbox.org>.
This avoids the segfault, but we still need to sort out
why the GPU does not come back online after a resume.
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=105760
Acked-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
index d644a9bb9078..9f407c48d4f0 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
@@ -381,6 +381,7 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
uint32_t fw_to_load;
int result = 0;
struct SMU_DRAMData_TOC *toc;
+ uint32_t num_entries = 0;
if (!hwmgr->reload_fw) {
pr_info("skip reloading...\n");
@@ -422,41 +423,41 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
}
toc = (struct SMU_DRAMData_TOC *)smu_data->header;
- toc->num_entries = 0;
toc->structure_version = 1;
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_RLC_G, &toc->entry[toc->num_entries++]),
+ UCODE_ID_RLC_G, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_CE, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_CE, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_PFP, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_PFP, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_ME, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_ME, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_MEC, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_MEC, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_MEC_JT1, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_MEC_JT1, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_MEC_JT2, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_MEC_JT2, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_SDMA0, &toc->entry[toc->num_entries++]),
+ UCODE_ID_SDMA0, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_SDMA1, &toc->entry[toc->num_entries++]),
+ UCODE_ID_SDMA1, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
if (!hwmgr->not_vf)
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_MEC_STORAGE, &toc->entry[toc->num_entries++]),
+ UCODE_ID_MEC_STORAGE, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
+ toc->num_entries = num_entries;
smu7_send_msg_to_smc_with_parameter(hwmgr, PPSMC_MSG_DRV_DRAM_ADDR_HI, upper_32_bits(smu_data->header_buffer.mc_addr));
smu7_send_msg_to_smc_with_parameter(hwmgr, PPSMC_MSG_DRV_DRAM_ADDR_LO, lower_32_bits(smu_data->header_buffer.mc_addr));
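A generic sketch of the pattern the fix applies (struct toc_sketch and
fill_toc() are hypothetical, not the powerplay code): keep the running index
in a CPU-side local and publish it once, instead of post-incrementing a
counter that lives in device memory and may read back as all 1s when the
device has dropped off the bus.

#include <stdint.h>
#include <stdio.h>

struct toc_sketch {
        uint32_t structure_version;
        uint32_t num_entries;           /* imagine this field backed by VRAM */
        uint32_t entry[8];
};

static void fill_toc(struct toc_sketch *toc)
{
        uint32_t num_entries = 0;       /* local index, always trustworthy */

        toc->structure_version = 1;
        toc->entry[num_entries++] = 0xa;        /* firmware entry #1 */
        toc->entry[num_entries++] = 0xb;        /* firmware entry #2 */

        toc->num_entries = num_entries;         /* single write at the end */
}

int main(void)
{
        struct toc_sketch toc = { 0 };

        fill_toc(&toc);
        printf("entries: %u\n", toc.num_entries);
        return 0;
}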
The patch below does not apply to the 4.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 02ce6ce2e1d07e31e8314c761a2caa087ea094ce Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher(a)amd.com>
Date: Thu, 12 Jul 2018 08:38:09 -0500
Subject: [PATCH] drm/amdgpu/pp/smu7: use a local variable for toc indexing
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Use a local variable rather than the index variable stored in vram. If
the device fails to come back online after a resume cycle,
reads from vram will return all 1s, which will cause a
segfault. Based on a patch from Thomas Martitz <kugel(a)rockbox.org>.
This avoids the segfault, but we still need to sort out
why the GPU does not come back online after a resume.
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=105760
Acked-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
index d644a9bb9078..9f407c48d4f0 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
@@ -381,6 +381,7 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
uint32_t fw_to_load;
int result = 0;
struct SMU_DRAMData_TOC *toc;
+ uint32_t num_entries = 0;
if (!hwmgr->reload_fw) {
pr_info("skip reloading...\n");
@@ -422,41 +423,41 @@ int smu7_request_smu_load_fw(struct pp_hwmgr *hwmgr)
}
toc = (struct SMU_DRAMData_TOC *)smu_data->header;
- toc->num_entries = 0;
toc->structure_version = 1;
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_RLC_G, &toc->entry[toc->num_entries++]),
+ UCODE_ID_RLC_G, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_CE, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_CE, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_PFP, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_PFP, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_ME, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_ME, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_MEC, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_MEC, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_MEC_JT1, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_MEC_JT1, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_CP_MEC_JT2, &toc->entry[toc->num_entries++]),
+ UCODE_ID_CP_MEC_JT2, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_SDMA0, &toc->entry[toc->num_entries++]),
+ UCODE_ID_SDMA0, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_SDMA1, &toc->entry[toc->num_entries++]),
+ UCODE_ID_SDMA1, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
if (!hwmgr->not_vf)
PP_ASSERT_WITH_CODE(0 == smu7_populate_single_firmware_entry(hwmgr,
- UCODE_ID_MEC_STORAGE, &toc->entry[toc->num_entries++]),
+ UCODE_ID_MEC_STORAGE, &toc->entry[num_entries++]),
"Failed to Get Firmware Entry.", return -EINVAL);
+ toc->num_entries = num_entries;
smu7_send_msg_to_smc_with_parameter(hwmgr, PPSMC_MSG_DRV_DRAM_ADDR_HI, upper_32_bits(smu_data->header_buffer.mc_addr));
smu7_send_msg_to_smc_with_parameter(hwmgr, PPSMC_MSG_DRV_DRAM_ADDR_LO, lower_32_bits(smu_data->header_buffer.mc_addr));
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 9d4a0d4cdc8b5904ec7c9b9e04bab3e9e60d7a74 Mon Sep 17 00:00:00 2001
From: Andrey Grodzovsky <andrey.grodzovsky(a)amd.com>
Date: Thu, 5 Jul 2018 14:49:34 -0400
Subject: [PATCH] drm/amdgpu: Verify root PD is mapped into kernel address
space (v4)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Problem: When a PD/PT update is made by the CPU, the root PD may not yet be
mapped, causing a page fault.
Fix: Verify root PD is mapped into CPU address space.
v2:
Make sure that we add the root PD to the relocated list
since it then gets mapped into the CPU address space by default
in amdgpu_vm_update_directories.
v3:
Drop change to not move kernel type BOs to evicted list.
v4:
Remove redundant bo move to relocated list.
Link: https://bugs.freedesktop.org/show_bug.cgi?id=107065
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky(a)amd.com>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index edf16b2b957a..fdcb498f6d19 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -107,6 +107,9 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base,
return;
list_add_tail(&base->bo_list, &bo->va);
+ if (bo->tbo.type == ttm_bo_type_kernel)
+ list_move(&base->vm_status, &vm->relocated);
+
if (bo->tbo.resv != vm->root.base.bo->tbo.resv)
return;
@@ -468,7 +471,6 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device *adev,
pt->parent = amdgpu_bo_ref(parent->base.bo);
amdgpu_vm_bo_base_init(&entry->base, vm, pt);
- list_move(&entry->base.vm_status, &vm->relocated);
}
if (level < AMDGPU_VM_PTB) {
The patch below does not apply to the 4.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 9d4a0d4cdc8b5904ec7c9b9e04bab3e9e60d7a74 Mon Sep 17 00:00:00 2001
From: Andrey Grodzovsky <andrey.grodzovsky(a)amd.com>
Date: Thu, 5 Jul 2018 14:49:34 -0400
Subject: [PATCH] drm/amdgpu: Verify root PD is mapped into kernel address
space (v4)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Problem: When a PD/PT update is made by the CPU, the root PD may not yet be
mapped, causing a page fault.
Fix: Verify root PD is mapped into CPU address space.
v2:
Make sure that we add the root PD to the relocated list
since it then gets mapped into the CPU address space by default
in amdgpu_vm_update_directories.
v3:
Drop change to not move kernel type BOs to evicted list.
v4:
Remove redundant bo move to relocated list.
Link: https://bugs.freedesktop.org/show_bug.cgi?id=107065
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky(a)amd.com>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index edf16b2b957a..fdcb498f6d19 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -107,6 +107,9 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base,
return;
list_add_tail(&base->bo_list, &bo->va);
+ if (bo->tbo.type == ttm_bo_type_kernel)
+ list_move(&base->vm_status, &vm->relocated);
+
if (bo->tbo.resv != vm->root.base.bo->tbo.resv)
return;
@@ -468,7 +471,6 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device *adev,
pt->parent = amdgpu_bo_ref(parent->base.bo);
amdgpu_vm_bo_base_init(&entry->base, vm, pt);
- list_move(&entry->base.vm_status, &vm->relocated);
}
if (level < AMDGPU_VM_PTB) {
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 76fa4975f3ed12d15762bc979ca44078598ed8ee Mon Sep 17 00:00:00 2001
From: Alexey Kardashevskiy <aik(a)ozlabs.ru>
Date: Tue, 17 Jul 2018 17:19:13 +1000
Subject: [PATCH] KVM: PPC: Check if IOMMU page is contained in the pinned
physical page
A VM which has:
- a DMA capable device passed through to it (eg. network card);
- running a malicious kernel that ignores H_PUT_TCE failure;
- capability of using IOMMU pages bigger than physical pages
can create an IOMMU mapping that exposes (for example) 16MB of
the host physical memory to the device when only 64K was allocated to the VM.
The remaining 16MB - 64K will be some other content of host memory, possibly
including pages of the VM, but also pages of host kernel memory, host
programs or other VMs.
The attacking VM does not control the location of the page it can map,
and is only allowed to map as many pages as it has pages of RAM.
We already have a check in drivers/vfio/vfio_iommu_spapr_tce.c that
an IOMMU page is contained in the physical page so the PCI hardware won't
get access to unassigned host memory; however this check is missing in
the KVM fastpath (H_PUT_TCE accelerated code). We were lucky so far and
did not hit this yet as the very first time when the mapping happens
we do not have tbl::it_userspace allocated yet and fall back to
the userspace, which in turn calls the VFIO IOMMU driver; this fails and
the guest does not retry.
This stores the smallest preregistered page size in the preregistered
region descriptor and changes the mm_iommu_xxx API to check this against
the IOMMU page size.
This calculates maximum page size as a minimum of the natural region
alignment and compound page size. For the page shift this uses the shift
returned by find_linux_pte() which indicates how the page is mapped to
the current userspace - if the page is huge and this is not a zero, then
it is a leaf pte and the page is mapped within the range.
Fixes: 121f80ba68f1 ("KVM: PPC: VFIO: Add in-kernel acceleration for VFIO")
Cc: stable(a)vger.kernel.org # v4.12+
Signed-off-by: Alexey Kardashevskiy <aik(a)ozlabs.ru>
Reviewed-by: David Gibson <david(a)gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 896efa559996..79d570cbf332 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -35,9 +35,9 @@ extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup_rm(
extern struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
unsigned long ua, unsigned long entries);
extern long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
- unsigned long ua, unsigned long *hpa);
+ unsigned long ua, unsigned int pageshift, unsigned long *hpa);
extern long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
- unsigned long ua, unsigned long *hpa);
+ unsigned long ua, unsigned int pageshift, unsigned long *hpa);
extern long mm_iommu_mapped_inc(struct mm_iommu_table_group_mem_t *mem);
extern void mm_iommu_mapped_dec(struct mm_iommu_table_group_mem_t *mem);
#endif
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index d066e37551ec..8c456fa691a5 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -449,7 +449,7 @@ long kvmppc_tce_iommu_do_map(struct kvm *kvm, struct iommu_table *tbl,
/* This only handles v2 IOMMU type, v1 is handled via ioctl() */
return H_TOO_HARD;
- if (WARN_ON_ONCE(mm_iommu_ua_to_hpa(mem, ua, &hpa)))
+ if (WARN_ON_ONCE(mm_iommu_ua_to_hpa(mem, ua, tbl->it_page_shift, &hpa)))
return H_HARDWARE;
if (mm_iommu_mapped_inc(mem))
diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c b/arch/powerpc/kvm/book3s_64_vio_hv.c
index 925fc316a104..5b298f5a1a14 100644
--- a/arch/powerpc/kvm/book3s_64_vio_hv.c
+++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
@@ -279,7 +279,8 @@ static long kvmppc_rm_tce_iommu_do_map(struct kvm *kvm, struct iommu_table *tbl,
if (!mem)
return H_TOO_HARD;
- if (WARN_ON_ONCE_RM(mm_iommu_ua_to_hpa_rm(mem, ua, &hpa)))
+ if (WARN_ON_ONCE_RM(mm_iommu_ua_to_hpa_rm(mem, ua, tbl->it_page_shift,
+ &hpa)))
return H_HARDWARE;
pua = (void *) vmalloc_to_phys(pua);
@@ -469,7 +470,8 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu,
mem = mm_iommu_lookup_rm(vcpu->kvm->mm, ua, IOMMU_PAGE_SIZE_4K);
if (mem)
- prereg = mm_iommu_ua_to_hpa_rm(mem, ua, &tces) == 0;
+ prereg = mm_iommu_ua_to_hpa_rm(mem, ua,
+ IOMMU_PAGE_SHIFT_4K, &tces) == 0;
}
if (!prereg) {
diff --git a/arch/powerpc/mm/mmu_context_iommu.c b/arch/powerpc/mm/mmu_context_iommu.c
index abb43646927a..a4ca57612558 100644
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -19,6 +19,7 @@
#include <linux/hugetlb.h>
#include <linux/swap.h>
#include <asm/mmu_context.h>
+#include <asm/pte-walk.h>
static DEFINE_MUTEX(mem_list_mutex);
@@ -27,6 +28,7 @@ struct mm_iommu_table_group_mem_t {
struct rcu_head rcu;
unsigned long used;
atomic64_t mapped;
+ unsigned int pageshift;
u64 ua; /* userspace address */
u64 entries; /* number of entries in hpas[] */
u64 *hpas; /* vmalloc'ed */
@@ -125,6 +127,8 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
{
struct mm_iommu_table_group_mem_t *mem;
long i, j, ret = 0, locked_entries = 0;
+ unsigned int pageshift;
+ unsigned long flags;
struct page *page = NULL;
mutex_lock(&mem_list_mutex);
@@ -159,6 +163,12 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
goto unlock_exit;
}
+ /*
+ * For a starting point for a maximum page size calculation
+ * we use @ua and @entries natural alignment to allow IOMMU pages
+ * smaller than huge pages but still bigger than PAGE_SIZE.
+ */
+ mem->pageshift = __ffs(ua | (entries << PAGE_SHIFT));
mem->hpas = vzalloc(array_size(entries, sizeof(mem->hpas[0])));
if (!mem->hpas) {
kfree(mem);
@@ -199,6 +209,23 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
}
}
populate:
+ pageshift = PAGE_SHIFT;
+ if (PageCompound(page)) {
+ pte_t *pte;
+ struct page *head = compound_head(page);
+ unsigned int compshift = compound_order(head);
+
+ local_irq_save(flags); /* disables as well */
+ pte = find_linux_pte(mm->pgd, ua, NULL, &pageshift);
+ local_irq_restore(flags);
+
+ /* Double check it is still the same pinned page */
+ if (pte && pte_page(*pte) == head &&
+ pageshift == compshift)
+ pageshift = max_t(unsigned int, pageshift,
+ PAGE_SHIFT);
+ }
+ mem->pageshift = min(mem->pageshift, pageshift);
mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
}
@@ -349,7 +376,7 @@ struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
EXPORT_SYMBOL_GPL(mm_iommu_find);
long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
- unsigned long ua, unsigned long *hpa)
+ unsigned long ua, unsigned int pageshift, unsigned long *hpa)
{
const long entry = (ua - mem->ua) >> PAGE_SHIFT;
u64 *va = &mem->hpas[entry];
@@ -357,6 +384,9 @@ long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
if (entry >= mem->entries)
return -EFAULT;
+ if (pageshift > mem->pageshift)
+ return -EFAULT;
+
*hpa = *va | (ua & ~PAGE_MASK);
return 0;
@@ -364,7 +394,7 @@ long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
EXPORT_SYMBOL_GPL(mm_iommu_ua_to_hpa);
long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
- unsigned long ua, unsigned long *hpa)
+ unsigned long ua, unsigned int pageshift, unsigned long *hpa)
{
const long entry = (ua - mem->ua) >> PAGE_SHIFT;
void *va = &mem->hpas[entry];
@@ -373,6 +403,9 @@ long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
if (entry >= mem->entries)
return -EFAULT;
+ if (pageshift > mem->pageshift)
+ return -EFAULT;
+
pa = (void *) vmalloc_to_phys(va);
if (!pa)
return -EFAULT;
diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
index 2da5f054257a..7cd63b0c1a46 100644
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -467,7 +467,7 @@ static int tce_iommu_prereg_ua_to_hpa(struct tce_container *container,
if (!mem)
return -EINVAL;
- ret = mm_iommu_ua_to_hpa(mem, tce, phpa);
+ ret = mm_iommu_ua_to_hpa(mem, tce, shift, phpa);
if (ret)
return -EINVAL;
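For illustration, a userspace sketch of the "natural alignment" page-shift
calculation and the containment check described above (a PAGE_SHIFT of 16 is
assumed, as with 64K pages on ppc64; lowest_bit() and the other names are
illustrative, not the kernel's):

#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 16

/* Index of the lowest set bit, like the kernel's __ffs(). */
static unsigned int lowest_bit(uint64_t x)
{
        return (unsigned int)__builtin_ctzll(x);
}

int main(void)
{
        uint64_t ua = 0x10000;          /* 64K-aligned userspace address */
        uint64_t entries = 1;           /* region is one 64K system page */

        /* Natural alignment of the region caps the usable IOMMU page size. */
        unsigned int region_shift = lowest_bit(ua | (entries << PAGE_SHIFT));
        unsigned int iommu_shift = 24;  /* guest asks for a 16MB IOMMU page */

        printf("region allows IOMMU pages up to 2^%u bytes\n", region_shift);
        printf("16MB mapping %s\n",
               iommu_shift > region_shift ? "rejected (-EFAULT)" : "allowed");
        return 0;
}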
The patch below does not apply to the 4.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 76fa4975f3ed12d15762bc979ca44078598ed8ee Mon Sep 17 00:00:00 2001
From: Alexey Kardashevskiy <aik(a)ozlabs.ru>
Date: Tue, 17 Jul 2018 17:19:13 +1000
Subject: [PATCH] KVM: PPC: Check if IOMMU page is contained in the pinned
physical page
A VM which has:
- a DMA capable device passed through to it (eg. network card);
- running a malicious kernel that ignores H_PUT_TCE failure;
- capability of using IOMMU pages bigger than physical pages
can create an IOMMU mapping that exposes (for example) 16MB of
the host physical memory to the device when only 64K was allocated to the VM.
The remaining 16MB - 64K will be some other content of host memory, possibly
including pages of the VM, but also pages of host kernel memory, host
programs or other VMs.
The attacking VM does not control the location of the page it can map,
and is only allowed to map as many pages as it has pages of RAM.
We already have a check in drivers/vfio/vfio_iommu_spapr_tce.c that
an IOMMU page is contained in the physical page so the PCI hardware won't
get access to unassigned host memory; however this check is missing in
the KVM fastpath (H_PUT_TCE accelerated code). We were lucky so far and
did not hit this yet as the very first time when the mapping happens
we do not have tbl::it_userspace allocated yet and fall back to
the userspace, which in turn calls the VFIO IOMMU driver; this fails and
the guest does not retry.
This stores the smallest preregistered page size in the preregistered
region descriptor and changes the mm_iommu_xxx API to check this against
the IOMMU page size.
This calculates maximum page size as a minimum of the natural region
alignment and compound page size. For the page shift this uses the shift
returned by find_linux_pte() which indicates how the page is mapped to
the current userspace - if the page is huge and this is not a zero, then
it is a leaf pte and the page is mapped within the range.
Fixes: 121f80ba68f1 ("KVM: PPC: VFIO: Add in-kernel acceleration for VFIO")
Cc: stable(a)vger.kernel.org # v4.12+
Signed-off-by: Alexey Kardashevskiy <aik(a)ozlabs.ru>
Reviewed-by: David Gibson <david(a)gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 896efa559996..79d570cbf332 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -35,9 +35,9 @@ extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup_rm(
extern struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
unsigned long ua, unsigned long entries);
extern long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
- unsigned long ua, unsigned long *hpa);
+ unsigned long ua, unsigned int pageshift, unsigned long *hpa);
extern long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
- unsigned long ua, unsigned long *hpa);
+ unsigned long ua, unsigned int pageshift, unsigned long *hpa);
extern long mm_iommu_mapped_inc(struct mm_iommu_table_group_mem_t *mem);
extern void mm_iommu_mapped_dec(struct mm_iommu_table_group_mem_t *mem);
#endif
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index d066e37551ec..8c456fa691a5 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -449,7 +449,7 @@ long kvmppc_tce_iommu_do_map(struct kvm *kvm, struct iommu_table *tbl,
/* This only handles v2 IOMMU type, v1 is handled via ioctl() */
return H_TOO_HARD;
- if (WARN_ON_ONCE(mm_iommu_ua_to_hpa(mem, ua, &hpa)))
+ if (WARN_ON_ONCE(mm_iommu_ua_to_hpa(mem, ua, tbl->it_page_shift, &hpa)))
return H_HARDWARE;
if (mm_iommu_mapped_inc(mem))
diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c b/arch/powerpc/kvm/book3s_64_vio_hv.c
index 925fc316a104..5b298f5a1a14 100644
--- a/arch/powerpc/kvm/book3s_64_vio_hv.c
+++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
@@ -279,7 +279,8 @@ static long kvmppc_rm_tce_iommu_do_map(struct kvm *kvm, struct iommu_table *tbl,
if (!mem)
return H_TOO_HARD;
- if (WARN_ON_ONCE_RM(mm_iommu_ua_to_hpa_rm(mem, ua, &hpa)))
+ if (WARN_ON_ONCE_RM(mm_iommu_ua_to_hpa_rm(mem, ua, tbl->it_page_shift,
+ &hpa)))
return H_HARDWARE;
pua = (void *) vmalloc_to_phys(pua);
@@ -469,7 +470,8 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu,
mem = mm_iommu_lookup_rm(vcpu->kvm->mm, ua, IOMMU_PAGE_SIZE_4K);
if (mem)
- prereg = mm_iommu_ua_to_hpa_rm(mem, ua, &tces) == 0;
+ prereg = mm_iommu_ua_to_hpa_rm(mem, ua,
+ IOMMU_PAGE_SHIFT_4K, &tces) == 0;
}
if (!prereg) {
diff --git a/arch/powerpc/mm/mmu_context_iommu.c b/arch/powerpc/mm/mmu_context_iommu.c
index abb43646927a..a4ca57612558 100644
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -19,6 +19,7 @@
#include <linux/hugetlb.h>
#include <linux/swap.h>
#include <asm/mmu_context.h>
+#include <asm/pte-walk.h>
static DEFINE_MUTEX(mem_list_mutex);
@@ -27,6 +28,7 @@ struct mm_iommu_table_group_mem_t {
struct rcu_head rcu;
unsigned long used;
atomic64_t mapped;
+ unsigned int pageshift;
u64 ua; /* userspace address */
u64 entries; /* number of entries in hpas[] */
u64 *hpas; /* vmalloc'ed */
@@ -125,6 +127,8 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
{
struct mm_iommu_table_group_mem_t *mem;
long i, j, ret = 0, locked_entries = 0;
+ unsigned int pageshift;
+ unsigned long flags;
struct page *page = NULL;
mutex_lock(&mem_list_mutex);
@@ -159,6 +163,12 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
goto unlock_exit;
}
+ /*
+ * For a starting point for a maximum page size calculation
+ * we use @ua and @entries natural alignment to allow IOMMU pages
+ * smaller than huge pages but still bigger than PAGE_SIZE.
+ */
+ mem->pageshift = __ffs(ua | (entries << PAGE_SHIFT));
mem->hpas = vzalloc(array_size(entries, sizeof(mem->hpas[0])));
if (!mem->hpas) {
kfree(mem);
@@ -199,6 +209,23 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
}
}
populate:
+ pageshift = PAGE_SHIFT;
+ if (PageCompound(page)) {
+ pte_t *pte;
+ struct page *head = compound_head(page);
+ unsigned int compshift = compound_order(head);
+
+ local_irq_save(flags); /* disables as well */
+ pte = find_linux_pte(mm->pgd, ua, NULL, &pageshift);
+ local_irq_restore(flags);
+
+ /* Double check it is still the same pinned page */
+ if (pte && pte_page(*pte) == head &&
+ pageshift == compshift)
+ pageshift = max_t(unsigned int, pageshift,
+ PAGE_SHIFT);
+ }
+ mem->pageshift = min(mem->pageshift, pageshift);
mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
}
@@ -349,7 +376,7 @@ struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
EXPORT_SYMBOL_GPL(mm_iommu_find);
long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
- unsigned long ua, unsigned long *hpa)
+ unsigned long ua, unsigned int pageshift, unsigned long *hpa)
{
const long entry = (ua - mem->ua) >> PAGE_SHIFT;
u64 *va = &mem->hpas[entry];
@@ -357,6 +384,9 @@ long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
if (entry >= mem->entries)
return -EFAULT;
+ if (pageshift > mem->pageshift)
+ return -EFAULT;
+
*hpa = *va | (ua & ~PAGE_MASK);
return 0;
@@ -364,7 +394,7 @@ long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
EXPORT_SYMBOL_GPL(mm_iommu_ua_to_hpa);
long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
- unsigned long ua, unsigned long *hpa)
+ unsigned long ua, unsigned int pageshift, unsigned long *hpa)
{
const long entry = (ua - mem->ua) >> PAGE_SHIFT;
void *va = &mem->hpas[entry];
@@ -373,6 +403,9 @@ long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
if (entry >= mem->entries)
return -EFAULT;
+ if (pageshift > mem->pageshift)
+ return -EFAULT;
+
pa = (void *) vmalloc_to_phys(va);
if (!pa)
return -EFAULT;
diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
index 2da5f054257a..7cd63b0c1a46 100644
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -467,7 +467,7 @@ static int tce_iommu_prereg_ua_to_hpa(struct tce_container *container,
if (!mem)
return -EINVAL;
- ret = mm_iommu_ua_to_hpa(mem, tce, phpa);
+ ret = mm_iommu_ua_to_hpa(mem, tce, shift, phpa);
if (ret)
return -EINVAL;
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From bd3599a0e142cd73edd3b6801068ac3f48ac771a Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Thu, 12 Jul 2018 01:36:43 +0100
Subject: [PATCH] Btrfs: fix file data corruption after cloning a range and
fsync
When we clone a range into a file we can end up dropping existing
extent maps (or trimming them) and replacing them with new ones if the
range to be cloned overlaps with a range in the destination inode.
When that happens we add the new extent maps to the list of modified
extents in the inode's extent map tree, so that a "fast" fsync (the flag
BTRFS_INODE_NEEDS_FULL_SYNC not set in the inode) will see the extent maps
and log corresponding extent items. However, at the end of range cloning
operation we do truncate all the pages in the affected range (in order to
ensure future reads will not get stale data). Sometimes this truncation
will release the corresponding extent maps besides the pages from the page
cache. If this happens, then a "fast" fsync operation will miss logging
some extent items, because it relies exclusively on the extent maps being
present in the inode's extent tree, leading to data loss/corruption if
the fsync ends up using the same transaction used by the clone operation
(that transaction was not committed in the meanwhile). An extent map is
released through the callback btrfs_invalidatepage(), which gets called by
truncate_inode_pages_range(), and it calls __btrfs_releasepage(). The
latter ends up calling try_release_extent_mapping(), which will release the
extent map if some conditions are met, like the file size being greater
than 16Mb, gfp flags allow blocking and the range not being locked (which
is the case during the clone operation) nor being the extent map flagged
as pinned (also the case for cloning).
The following example, turned into a test for fstests, reproduces the
issue:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ xfs_io -f -c "pwrite -S 0x18 9000K 6908K" /mnt/foo
$ xfs_io -f -c "pwrite -S 0x20 2572K 156K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
# reflink destination offset corresponds to the size of file bar,
# 2728Kb minus 4Kb.
$ xfs_io -c ""reflink ${SCRATCH_MNT}/foo 0 2724K 15908K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
$ md5sum /mnt/bar
95a95813a8c2abc9aa75a6c2914a077e /mnt/bar
<power fail>
$ mount /dev/sdb /mnt
$ md5sum /mnt/bar
207fd8d0b161be8a84b945f0df8d5f8d /mnt/bar
# digest should be 95a95813a8c2abc9aa75a6c2914a077e like before the
# power failure
In the above example, the destination offset of the clone operation
corresponds to the size of the "bar" file minus 4Kb. So during the clone
operation, the extent map covering the range from 2572Kb to 2728Kb gets
trimmed so that it ends at offset 2724Kb, and a new extent map covering
the range from 2724Kb to 11724Kb is created. So at the end of the clone
operation when we ask to truncate the pages in the range from 2724Kb to
2724Kb + 15908Kb, the page invalidation callback ends up removing the new
extent map (through try_release_extent_mapping()) when the page at offset
2724Kb is passed to that callback.
Fix this by setting the bit BTRFS_INODE_NEEDS_FULL_SYNC whenever an extent
map is removed at try_release_extent_mapping(), forcing the next fsync to
search for modified extents in the fs/subvolume tree instead of relying on
the presence of extent maps in memory. This way we can continue doing a
"fast" fsync if the destination range of a clone operation does not
overlap with an existing range or if any of the criteria necessary to
remove an extent map at try_release_extent_mapping() is not met (file
size not bigger than 16Mb or gfp flags do not allow blocking).
CC: stable(a)vger.kernel.org # 3.16+
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1aa91d57404a..8fd86e29085f 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4241,8 +4241,9 @@ int try_release_extent_mapping(struct page *page, gfp_t mask)
struct extent_map *em;
u64 start = page_offset(page);
u64 end = start + PAGE_SIZE - 1;
- struct extent_io_tree *tree = &BTRFS_I(page->mapping->host)->io_tree;
- struct extent_map_tree *map = &BTRFS_I(page->mapping->host)->extent_tree;
+ struct btrfs_inode *btrfs_inode = BTRFS_I(page->mapping->host);
+ struct extent_io_tree *tree = &btrfs_inode->io_tree;
+ struct extent_map_tree *map = &btrfs_inode->extent_tree;
if (gfpflags_allow_blocking(mask) &&
page->mapping->host->i_size > SZ_16M) {
@@ -4265,6 +4266,8 @@ int try_release_extent_mapping(struct page *page, gfp_t mask)
extent_map_end(em) - 1,
EXTENT_LOCKED | EXTENT_WRITEBACK,
0, NULL)) {
+ set_bit(BTRFS_INODE_NEEDS_FULL_SYNC,
+ &btrfs_inode->runtime_flags);
remove_extent_mapping(map, em);
/* once for the rb tree */
free_extent_map(em);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From bd3599a0e142cd73edd3b6801068ac3f48ac771a Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Thu, 12 Jul 2018 01:36:43 +0100
Subject: [PATCH] Btrfs: fix file data corruption after cloning a range and
fsync
When we clone a range into a file we can end up dropping existing
extent maps (or trimming them) and replacing them with new ones if the
range to be cloned overlaps with a range in the destination inode.
When that happens we add the new extent maps to the list of modified
extents in the inode's extent map tree, so that a "fast" fsync (the flag
BTRFS_INODE_NEEDS_FULL_SYNC not set in the inode) will see the extent maps
and log corresponding extent items. However, at the end of range cloning
operation we do truncate all the pages in the affected range (in order to
ensure future reads will not get stale data). Sometimes this truncation
will release the corresponding extent maps besides the pages from the page
cache. If this happens, then a "fast" fsync operation will miss logging
some extent items, because it relies exclusively on the extent maps being
present in the inode's extent tree, leading to data loss/corruption if
the fsync ends up using the same transaction used by the clone operation
(that transaction was not committed in the meanwhile). An extent map is
released through the callback btrfs_invalidatepage(), which gets called by
truncate_inode_pages_range(), and it calls __btrfs_releasepage(). The
latter ends up calling try_release_extent_mapping(), which will release the
extent map if some conditions are met, like the file size being greater
than 16Mb, gfp flags allow blocking and the range not being locked (which
is the case during the clone operation) nor being the extent map flagged
as pinned (also the case for cloning).
The following example, turned into a test for fstests, reproduces the
issue:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ xfs_io -f -c "pwrite -S 0x18 9000K 6908K" /mnt/foo
$ xfs_io -f -c "pwrite -S 0x20 2572K 156K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
# reflink destination offset corresponds to the size of file bar,
# 2728Kb minus 4Kb.
$ xfs_io -c ""reflink ${SCRATCH_MNT}/foo 0 2724K 15908K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
$ md5sum /mnt/bar
95a95813a8c2abc9aa75a6c2914a077e /mnt/bar
<power fail>
$ mount /dev/sdb /mnt
$ md5sum /mnt/bar
207fd8d0b161be8a84b945f0df8d5f8d /mnt/bar
# digest should be 95a95813a8c2abc9aa75a6c2914a077e like before the
# power failure
In the above example, the destination offset of the clone operation
corresponds to the size of the "bar" file minus 4Kb. So during the clone
operation, the extent map covering the range from 2572Kb to 2728Kb gets
trimmed so that it ends at offset 2724Kb, and a new extent map covering
the range from 2724Kb to 11724Kb is created. So at the end of the clone
operation when we ask to truncate the pages in the range from 2724Kb to
2724Kb + 15908Kb, the page invalidation callback ends up removing the new
extent map (through try_release_extent_mapping()) when the page at offset
2724Kb is passed to that callback.
Fix this by setting the bit BTRFS_INODE_NEEDS_FULL_SYNC whenever an extent
map is removed at try_release_extent_mapping(), forcing the next fsync to
search for modified extents in the fs/subvolume tree instead of relying on
the presence of extent maps in memory. This way we can continue doing a
"fast" fsync if the destination range of a clone operation does not
overlap with an existing range or if any of the criteria necessary to
remove an extent map at try_release_extent_mapping() is not met (file
size not bigger than 16Mb or gfp flags do not allow blocking).
CC: stable(a)vger.kernel.org # 3.16+
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1aa91d57404a..8fd86e29085f 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4241,8 +4241,9 @@ int try_release_extent_mapping(struct page *page, gfp_t mask)
struct extent_map *em;
u64 start = page_offset(page);
u64 end = start + PAGE_SIZE - 1;
- struct extent_io_tree *tree = &BTRFS_I(page->mapping->host)->io_tree;
- struct extent_map_tree *map = &BTRFS_I(page->mapping->host)->extent_tree;
+ struct btrfs_inode *btrfs_inode = BTRFS_I(page->mapping->host);
+ struct extent_io_tree *tree = &btrfs_inode->io_tree;
+ struct extent_map_tree *map = &btrfs_inode->extent_tree;
if (gfpflags_allow_blocking(mask) &&
page->mapping->host->i_size > SZ_16M) {
@@ -4265,6 +4266,8 @@ int try_release_extent_mapping(struct page *page, gfp_t mask)
extent_map_end(em) - 1,
EXTENT_LOCKED | EXTENT_WRITEBACK,
0, NULL)) {
+ set_bit(BTRFS_INODE_NEEDS_FULL_SYNC,
+ &btrfs_inode->runtime_flags);
remove_extent_mapping(map, em);
/* once for the rb tree */
free_extent_map(em);
The patch below does not apply to the 4.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From bd3599a0e142cd73edd3b6801068ac3f48ac771a Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Thu, 12 Jul 2018 01:36:43 +0100
Subject: [PATCH] Btrfs: fix file data corruption after cloning a range and
fsync
When we clone a range into a file we can end up dropping existing
extent maps (or trimming them) and replacing them with new ones if the
range to be cloned overlaps with a range in the destination inode.
When that happens we add the new extent maps to the list of modified
extents in the inode's extent map tree, so that a "fast" fsync (the flag
BTRFS_INODE_NEEDS_FULL_SYNC not set in the inode) will see the extent maps
and log corresponding extent items. However, at the end of range cloning
operation we do truncate all the pages in the affected range (in order to
ensure future reads will not get stale data). Sometimes this truncation
will release the corresponding extent maps besides the pages from the page
cache. If this happens, then a "fast" fsync operation will miss logging
some extent items, because it relies exclusively on the extent maps being
present in the inode's extent tree, leading to data loss/corruption if
the fsync ends up using the same transaction used by the clone operation
(that transaction was not committed in the meanwhile). An extent map is
released through the callback btrfs_invalidatepage(), which gets called by
truncate_inode_pages_range(), and it calls __btrfs_releasepage(). The
latter ends up calling try_release_extent_mapping(), which will release the
extent map if some conditions are met, like the file size being greater
than 16Mb, gfp flags allow blocking and the range not being locked (which
is the case during the clone operation) nor being the extent map flagged
as pinned (also the case for cloning).
The following example, turned into a test for fstests, reproduces the
issue:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ xfs_io -f -c "pwrite -S 0x18 9000K 6908K" /mnt/foo
$ xfs_io -f -c "pwrite -S 0x20 2572K 156K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
# reflink destination offset corresponds to the size of file bar,
# 2728Kb minus 4Kb.
$ xfs_io -c "reflink /mnt/foo 0 2724K 15908K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
$ md5sum /mnt/bar
95a95813a8c2abc9aa75a6c2914a077e /mnt/bar
<power fail>
$ mount /dev/sdb /mnt
$ md5sum /mnt/bar
207fd8d0b161be8a84b945f0df8d5f8d /mnt/bar
# digest should be 95a95813a8c2abc9aa75a6c2914a077e like before the
# power failure
In the above example, the destination offset of the clone operation
corresponds to the size of the "bar" file minus 4Kb. So during the clone
operation, the extent map covering the range from 2572Kb to 2728Kb gets
trimmed so that it ends at offset 2724Kb, and a new extent map covering
the range from 2724Kb to 11724Kb is created. So at the end of the clone
operation when we ask to truncate the pages in the range from 2724Kb to
2724Kb + 15908Kb, the page invalidation callback ends up removing the new
extent map (through try_release_extent_mapping()) when the page at offset
2724Kb is passed to that callback.
Fix this by setting the bit BTRFS_INODE_NEEDS_FULL_SYNC whenever an extent
map is removed at try_release_extent_mapping(), forcing the next fsync to
search for modified extents in the fs/subvolume tree instead of relying on
the presence of extent maps in memory. This way we can continue doing a
"fast" fsync if the destination range of a clone operation does not
overlap with an existing range or if any of the criteria necessary to
remove an extent map at try_release_extent_mapping() is not met (file
size not bigger than 16Mb or gfp flags do not allow blocking).
CC: stable(a)vger.kernel.org # 3.16+
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1aa91d57404a..8fd86e29085f 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4241,8 +4241,9 @@ int try_release_extent_mapping(struct page *page, gfp_t mask)
struct extent_map *em;
u64 start = page_offset(page);
u64 end = start + PAGE_SIZE - 1;
- struct extent_io_tree *tree = &BTRFS_I(page->mapping->host)->io_tree;
- struct extent_map_tree *map = &BTRFS_I(page->mapping->host)->extent_tree;
+ struct btrfs_inode *btrfs_inode = BTRFS_I(page->mapping->host);
+ struct extent_io_tree *tree = &btrfs_inode->io_tree;
+ struct extent_map_tree *map = &btrfs_inode->extent_tree;
if (gfpflags_allow_blocking(mask) &&
page->mapping->host->i_size > SZ_16M) {
@@ -4265,6 +4266,8 @@ int try_release_extent_mapping(struct page *page, gfp_t mask)
extent_map_end(em) - 1,
EXTENT_LOCKED | EXTENT_WRITEBACK,
0, NULL)) {
+ set_bit(BTRFS_INODE_NEEDS_FULL_SYNC,
+ &btrfs_inode->runtime_flags);
remove_extent_mapping(map, em);
/* once for the rb tree */
free_extent_map(em);
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 95d6c0857e54b788982746071130d822a795026b Mon Sep 17 00:00:00 2001
From: "Rafael J. Wysocki" <rafael.j.wysocki(a)intel.com>
Date: Wed, 18 Jul 2018 13:38:37 +0200
Subject: [PATCH] cpufreq: intel_pstate: Register when ACPI PCCH is present
Currently, intel_pstate doesn't register if _PSS is not present on
HP Proliant systems, because it expects the firmware to take over
CPU performance scaling in that case. However, if ACPI PCCH is
present, the firmware expects the kernel to use it for CPU
performance scaling and the pcc-cpufreq driver is loaded for that.
Unfortunately, the firmware interface used by that driver is not
scalable for fundamental reasons, so pcc-cpufreq is way suboptimal
on systems with more than just a few CPUs. In fact, it is better to
avoid using it at all.
For this reason, modify intel_pstate to look for ACPI PCCH if _PSS
is not present and register if it is there. Also prevent the
pcc-cpufreq driver from trying to initialize itself if intel_pstate
has been registered already.
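As a quick sanity check (a sketch only, not part of the patch; the sysfs path
is the generic cpufreq one and the expected output assumes intel_pstate in its
default active mode), one can verify which driver ended up bound after boot:
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
intel_pstate
Before this change, an affected HP Proliant system without _PSS but with PCCH
would report pcc-cpufreq here instead.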
Fixes: fbbcdc0744da (intel_pstate: skip the driver if ACPI has power mgmt option)
Reported-by: Andreas Herrmann <aherrmann(a)suse.com>
Reviewed-by: Andreas Herrmann <aherrmann(a)suse.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada(a)linux.intel.com>
Tested-by: Andreas Herrmann <aherrmann(a)suse.com>
Cc: 4.16+ <stable(a)vger.kernel.org> # 4.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index ece120da3353..3c3971256130 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -2394,6 +2394,18 @@ static bool __init intel_pstate_no_acpi_pss(void)
return true;
}
+static bool __init intel_pstate_no_acpi_pcch(void)
+{
+ acpi_status status;
+ acpi_handle handle;
+
+ status = acpi_get_handle(NULL, "\\_SB", &handle);
+ if (ACPI_FAILURE(status))
+ return true;
+
+ return !acpi_has_method(handle, "PCCH");
+}
+
static bool __init intel_pstate_has_acpi_ppc(void)
{
int i;
@@ -2453,7 +2465,10 @@ static bool __init intel_pstate_platform_pwr_mgmt_exists(void)
switch (plat_info[idx].data) {
case PSS:
- return intel_pstate_no_acpi_pss();
+ if (!intel_pstate_no_acpi_pss())
+ return false;
+
+ return intel_pstate_no_acpi_pcch();
case PPC:
return intel_pstate_has_acpi_ppc() && !force_load;
}
diff --git a/drivers/cpufreq/pcc-cpufreq.c b/drivers/cpufreq/pcc-cpufreq.c
index 3f0ce2ae35ee..0c56c9759672 100644
--- a/drivers/cpufreq/pcc-cpufreq.c
+++ b/drivers/cpufreq/pcc-cpufreq.c
@@ -580,6 +580,10 @@ static int __init pcc_cpufreq_init(void)
{
int ret;
+ /* Skip initialization if another cpufreq driver is there. */
+ if (cpufreq_get_current_driver())
+ return 0;
+
if (acpi_disabled)
return 0;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 95d6c0857e54b788982746071130d822a795026b Mon Sep 17 00:00:00 2001
From: "Rafael J. Wysocki" <rafael.j.wysocki(a)intel.com>
Date: Wed, 18 Jul 2018 13:38:37 +0200
Subject: [PATCH] cpufreq: intel_pstate: Register when ACPI PCCH is present
Currently, intel_pstate doesn't register if _PSS is not present on
HP Proliant systems, because it expects the firmware to take over
CPU performance scaling in that case. However, if ACPI PCCH is
present, the firmware expects the kernel to use it for CPU
performance scaling and the pcc-cpufreq driver is loaded for that.
Unfortunately, the firmware interface used by that driver is not
scalable for fundamental reasons, so pcc-cpufreq is way suboptimal
on systems with more than just a few CPUs. In fact, it is better to
avoid using it at all.
For this reason, modify intel_pstate to look for ACPI PCCH if _PSS
is not present and register if it is there. Also prevent the
pcc-cpufreq driver from trying to initialize itself if intel_pstate
has been registered already.
Fixes: fbbcdc0744da (intel_pstate: skip the driver if ACPI has power mgmt option)
Reported-by: Andreas Herrmann <aherrmann(a)suse.com>
Reviewed-by: Andreas Herrmann <aherrmann(a)suse.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada(a)linux.intel.com>
Tested-by: Andreas Herrmann <aherrmann(a)suse.com>
Cc: 4.16+ <stable(a)vger.kernel.org> # 4.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index ece120da3353..3c3971256130 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -2394,6 +2394,18 @@ static bool __init intel_pstate_no_acpi_pss(void)
return true;
}
+static bool __init intel_pstate_no_acpi_pcch(void)
+{
+ acpi_status status;
+ acpi_handle handle;
+
+ status = acpi_get_handle(NULL, "\\_SB", &handle);
+ if (ACPI_FAILURE(status))
+ return true;
+
+ return !acpi_has_method(handle, "PCCH");
+}
+
static bool __init intel_pstate_has_acpi_ppc(void)
{
int i;
@@ -2453,7 +2465,10 @@ static bool __init intel_pstate_platform_pwr_mgmt_exists(void)
switch (plat_info[idx].data) {
case PSS:
- return intel_pstate_no_acpi_pss();
+ if (!intel_pstate_no_acpi_pss())
+ return false;
+
+ return intel_pstate_no_acpi_pcch();
case PPC:
return intel_pstate_has_acpi_ppc() && !force_load;
}
diff --git a/drivers/cpufreq/pcc-cpufreq.c b/drivers/cpufreq/pcc-cpufreq.c
index 3f0ce2ae35ee..0c56c9759672 100644
--- a/drivers/cpufreq/pcc-cpufreq.c
+++ b/drivers/cpufreq/pcc-cpufreq.c
@@ -580,6 +580,10 @@ static int __init pcc_cpufreq_init(void)
{
int ret;
+ /* Skip initialization if another cpufreq driver is there. */
+ if (cpufreq_get_current_driver())
+ return 0;
+
if (acpi_disabled)
return 0;
Commit b1092c9af9ed ("bcache: allow quick writeback when backing idle")
allows the writeback rate to be faster if there is no I/O request on a
bcache device. It works well if there is only one bcache device attached
to the cache set. If there are many bcache devices attached to a cache
set, it may introduce a performance regression because multiple faster
writeback threads of the idle bcache devices will compete for the btree-level
locks with the bcache devices that have I/O requests coming.
This patch fixes the above issue by only permitting fast writeback when
all bcache devices attached to the cache set are idle. And if one of the
bcache devices has a new I/O request coming, all writeback throughput is
minimized immediately and the PI controller __update_writeback_rate()
decides the upcoming writeback rate for each bcache device.
Also, when all bcache devices are idle, limiting the writeback rate to a small
number is a waste of throughput, especially when the backing devices are slower
non-rotational devices (e.g. SATA SSD). This patch sets a max writeback
rate for each backing device if the whole cache set is idle. A faster
writeback rate in idle time means new I/Os may have more available space
for dirty data, and people may observe better write performance as a result.
Please note bcache may change its cache mode at run time, and this patch
still works if the cache mode is switched away from writeback mode while there
is still dirty data on the cache.
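A simple way to observe the behaviour (a sketch; the sysfs paths assume a
single backing device registered as bcache0):
$ cat /sys/block/bcache0/bcache/writeback_rate
$ cat /sys/block/bcache0/bcache/writeback_rate_debug
With this patch, writeback_rate jumps to a large value once the whole cache
set has been idle for about 10 update rounds, and falls back to the value
computed by the PI controller as soon as new I/O arrives on any attached
bcache device.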
Fixes: b1092c9af9ed ("bcache: allow quick writeback when backing idle")
Cc: stable(a)vger.kernel.org #4.16+
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Michael Lyle <mlyle(a)lyle.org>
---
drivers/md/bcache/bcache.h | 9 +---
drivers/md/bcache/request.c | 42 ++++++++++++++-
drivers/md/bcache/sysfs.c | 14 +++--
drivers/md/bcache/util.c | 2 +-
drivers/md/bcache/util.h | 2 +-
drivers/md/bcache/writeback.c | 98 +++++++++++++++++++++++++----------
6 files changed, 126 insertions(+), 41 deletions(-)
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index d6bf294f3907..f7451e8be03c 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -328,13 +328,6 @@ struct cached_dev {
*/
atomic_t has_dirty;
- /*
- * Set to zero by things that touch the backing volume-- except
- * writeback. Incremented by writeback. Used to determine when to
- * accelerate idle writeback.
- */
- atomic_t backing_idle;
-
struct bch_ratelimit writeback_rate;
struct delayed_work writeback_rate_update;
@@ -514,6 +507,8 @@ struct cache_set {
struct cache_accounting accounting;
unsigned long flags;
+ atomic_t idle_counter;
+ atomic_t at_max_writeback_rate;
struct cache_sb sb;
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index ae67f5fa8047..fe45f561a054 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -1104,6 +1104,34 @@ static void detached_dev_do_request(struct bcache_device *d, struct bio *bio)
/* Cached devices - read & write stuff */
+static void quit_max_writeback_rate(struct cache_set *c)
+{
+ int i;
+ struct bcache_device *d;
+ struct cached_dev *dc;
+
+ mutex_lock(&bch_register_lock);
+
+ for (i = 0; i < c->devices_max_used; i++) {
+ if (!c->devices[i])
+ continue;
+
+ if (UUID_FLASH_ONLY(&c->uuids[i]))
+ continue;
+
+ d = c->devices[i];
+ dc = container_of(d, struct cached_dev, disk);
+ /*
+ * set writeback rate to default minimum value,
+ * then let update_writeback_rate() decide the
+ * upcoming rate.
+ */
+ atomic64_set(&dc->writeback_rate.rate, 1);
+ }
+
+ mutex_unlock(&bch_register_lock);
+}
+
static blk_qc_t cached_dev_make_request(struct request_queue *q,
struct bio *bio)
{
@@ -1119,7 +1147,19 @@ static blk_qc_t cached_dev_make_request(struct request_queue *q,
return BLK_QC_T_NONE;
}
- atomic_set(&dc->backing_idle, 0);
+ if (d->c) {
+ atomic_set(&d->c->idle_counter, 0);
+ /*
+ * If at_max_writeback_rate of cache set is true and new I/O
+ * comes, quit max writeback rate of all cached devices
+ * attached to this cache set, and set at_max_writeback_rate
+ * to false.
+ */
+ if (unlikely(atomic_read(&d->c->at_max_writeback_rate) == 1)) {
+ atomic_set(&d->c->at_max_writeback_rate, 0);
+ quit_max_writeback_rate(d->c);
+ }
+ }
generic_start_io_acct(q, rw, bio_sectors(bio), &d->disk->part0);
bio_set_dev(bio, dc->bdev);
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 225b15aa0340..d719021bff81 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -170,7 +170,8 @@ SHOW(__bch_cached_dev)
var_printf(writeback_running, "%i");
var_print(writeback_delay);
var_print(writeback_percent);
- sysfs_hprint(writeback_rate, dc->writeback_rate.rate << 9);
+ sysfs_hprint(writeback_rate,
+ atomic64_read(&dc->writeback_rate.rate) << 9);
sysfs_hprint(io_errors, atomic_read(&dc->io_errors));
sysfs_printf(io_error_limit, "%i", dc->error_limit);
sysfs_printf(io_disable, "%i", dc->io_disable);
@@ -188,7 +189,8 @@ SHOW(__bch_cached_dev)
char change[20];
s64 next_io;
- bch_hprint(rate, dc->writeback_rate.rate << 9);
+ bch_hprint(rate,
+ atomic64_read(&dc->writeback_rate.rate) << 9);
bch_hprint(dirty, bcache_dev_sectors_dirty(&dc->disk) << 9);
bch_hprint(target, dc->writeback_rate_target << 9);
bch_hprint(proportional,dc->writeback_rate_proportional << 9);
@@ -255,8 +257,12 @@ STORE(__cached_dev)
sysfs_strtoul_clamp(writeback_percent, dc->writeback_percent, 0, 40);
- sysfs_strtoul_clamp(writeback_rate,
- dc->writeback_rate.rate, 1, INT_MAX);
+ if (attr == &sysfs_writeback_rate) {
+ int v;
+
+ sysfs_strtoul_clamp(writeback_rate, v, 1, INT_MAX);
+ atomic64_set(&dc->writeback_rate.rate, v);
+ }
sysfs_strtoul_clamp(writeback_rate_update_seconds,
dc->writeback_rate_update_seconds,
diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c
index fc479b026d6d..84f90c3d996d 100644
--- a/drivers/md/bcache/util.c
+++ b/drivers/md/bcache/util.c
@@ -200,7 +200,7 @@ uint64_t bch_next_delay(struct bch_ratelimit *d, uint64_t done)
{
uint64_t now = local_clock();
- d->next += div_u64(done * NSEC_PER_SEC, d->rate);
+ d->next += div_u64(done * NSEC_PER_SEC, atomic64_read(&d->rate));
/* Bound the time. Don't let us fall further than 2 seconds behind
* (this prevents unnecessary backlog that would make it impossible
diff --git a/drivers/md/bcache/util.h b/drivers/md/bcache/util.h
index cced87f8eb27..7e17f32ab563 100644
--- a/drivers/md/bcache/util.h
+++ b/drivers/md/bcache/util.h
@@ -442,7 +442,7 @@ struct bch_ratelimit {
* Rate at which we want to do work, in units per second
* The units here correspond to the units passed to bch_next_delay()
*/
- uint32_t rate;
+ atomic64_t rate;
};
static inline void bch_ratelimit_reset(struct bch_ratelimit *d)
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index ad45ebe1a74b..72059f910230 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -49,6 +49,63 @@ static uint64_t __calc_target_rate(struct cached_dev *dc)
return (cache_dirty_target * bdev_share) >> WRITEBACK_SHARE_SHIFT;
}
+static bool set_at_max_writeback_rate(struct cache_set *c,
+ struct cached_dev *dc)
+{
+ int i, dirty_dc_nr = 0;
+ struct bcache_device *d;
+
+ mutex_lock(&bch_register_lock);
+ for (i = 0; i < c->devices_max_used; i++) {
+ if (!c->devices[i])
+ continue;
+ if (UUID_FLASH_ONLY(&c->uuids[i]))
+ continue;
+ d = c->devices[i];
+ dc = container_of(d, struct cached_dev, disk);
+ if (atomic_read(&dc->has_dirty))
+ dirty_dc_nr++;
+ }
+ mutex_unlock(&bch_register_lock);
+
+ /*
+ * idle_counter is increased every time update_writeback_rate()
+ * is rescheduled in. If all backing devices attached to the same
+ * cache set have the same dc->writeback_rate_update_seconds value,
+ * then after about 10 rounds of update_writeback_rate() on each
+ * backing device the code falls through, sets 1 to
+ * c->at_max_writeback_rate, and sets a max writeback rate on each
+ * dc->writeback_rate.rate. This is not very accurate but works well
+ * enough to make sure the whole cache set has no new I/O coming before
+ * the writeback rate is set to a max number.
+ */
+ if (atomic_inc_return(&c->idle_counter) < dirty_dc_nr * 10)
+ return false;
+
+ if (atomic_read(&c->at_max_writeback_rate) != 1)
+ atomic_set(&c->at_max_writeback_rate, 1);
+
+
+ atomic64_set(&dc->writeback_rate.rate, INT_MAX);
+
+ /* keep writeback_rate_target as existing value */
+ dc->writeback_rate_proportional = 0;
+ dc->writeback_rate_integral_scaled = 0;
+ dc->writeback_rate_change = 0;
+
+ /*
+ * Check c->idle_counter and c->at_max_writeback_rate again in case
+ * new I/O arrives before set_at_max_writeback_rate() returns.
+ * Then the writeback rate is set to 1, and its new value should be
+ * decided via __update_writeback_rate().
+ */
+ if (atomic_read(&c->idle_counter) < dirty_dc_nr * 10 ||
+ !atomic_read(&c->at_max_writeback_rate))
+ return false;
+
+ return true;
+}
+
static void __update_writeback_rate(struct cached_dev *dc)
{
/*
@@ -104,8 +161,9 @@ static void __update_writeback_rate(struct cached_dev *dc)
dc->writeback_rate_proportional = proportional_scaled;
dc->writeback_rate_integral_scaled = integral_scaled;
- dc->writeback_rate_change = new_rate - dc->writeback_rate.rate;
- dc->writeback_rate.rate = new_rate;
+ dc->writeback_rate_change = new_rate -
+ atomic64_read(&dc->writeback_rate.rate);
+ atomic64_set(&dc->writeback_rate.rate, new_rate);
dc->writeback_rate_target = target;
}
@@ -138,9 +196,16 @@ static void update_writeback_rate(struct work_struct *work)
down_read(&dc->writeback_lock);
- if (atomic_read(&dc->has_dirty) &&
- dc->writeback_percent)
- __update_writeback_rate(dc);
+ if (atomic_read(&dc->has_dirty) && dc->writeback_percent) {
+ /*
+ * If the whole cache set is idle, set_at_max_writeback_rate()
+ * will set the writeback rate to a max number. Then it is
+ * unnecessary to update the writeback rate for an idle cache set
+ * that is already running at the maximum writeback rate.
+ */
+ if (!set_at_max_writeback_rate(c, dc))
+ __update_writeback_rate(dc);
+ }
up_read(&dc->writeback_lock);
@@ -422,27 +487,6 @@ static void read_dirty(struct cached_dev *dc)
delay = writeback_delay(dc, size);
- /* If the control system would wait for at least half a
- * second, and there's been no reqs hitting the backing disk
- * for awhile: use an alternate mode where we have at most
- * one contiguous set of writebacks in flight at a time. If
- * someone wants to do IO it will be quick, as it will only
- * have to contend with one operation in flight, and we'll
- * be round-tripping data to the backing disk as quickly as
- * it can accept it.
- */
- if (delay >= HZ / 2) {
- /* 3 means at least 1.5 seconds, up to 7.5 if we
- * have slowed way down.
- */
- if (atomic_inc_return(&dc->backing_idle) >= 3) {
- /* Wait for current I/Os to finish */
- closure_sync(&cl);
- /* And immediately launch a new set. */
- delay = 0;
- }
- }
-
while (!kthread_should_stop() &&
!test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) &&
delay) {
@@ -715,7 +759,7 @@ void bch_cached_dev_writeback_init(struct cached_dev *dc)
dc->writeback_running = true;
dc->writeback_percent = 10;
dc->writeback_delay = 30;
- dc->writeback_rate.rate = 1024;
+ atomic64_set(&dc->writeback_rate.rate, 1024);
dc->writeback_rate_minimum = 8;
dc->writeback_rate_update_seconds = WRITEBACK_RATE_UPDATE_SECS_DEFAULT;
--
2.17.1
This is the start of the stable review cycle for the 3.18.116 release.
There are 29 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun Jul 22 11:51:47 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.116-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 3.18.116-rc1
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
net/nfc: Avoid stalls when nfc_alloc_send_skb() returned NULL.
Santosh Shilimkar <santosh.shilimkar(a)oracle.com>
rds: avoid unenecessary cong_update in loop transport
Eric Biggers <ebiggers(a)google.com>
KEYS: DNS: fix parsing multiple options
Florian Westphal <fw(a)strlen.de>
netfilter: ebtables: reject non-bridge targets
Alex Vesker <valex(a)mellanox.com>
net/mlx5: Fix command interface race in polling mode
Konstantin Khlebnikov <khlebnikov(a)yandex-team.ru>
net_sched: blackhole: tell upper qdisc about dropped packets
Jason Wang <jasowang(a)redhat.com>
vhost_net: validate sock before trying to put its fd
Ilpo Järvinen <ilpo.jarvinen(a)helsinki.fi>
tcp: prevent bogus FRTO undos with non-SACK flows
Yuchung Cheng <ycheng(a)google.com>
tcp: fix Fast Open key endianness
Eric Dumazet <edumazet(a)google.com>
net: sungem: fix rx checksum support
Alex Vesker <valex(a)mellanox.com>
net/mlx5: Fix incorrect raw command length parsing
Eric Dumazet <edumazet(a)google.com>
net: dccp: switch rx_tstamp_last_feedback to monotonic clock
Eric Dumazet <edumazet(a)google.com>
net: dccp: avoid crash in ccid3_hc_rx_send_feedback()
Christian Lamparter <chunkeey(a)googlemail.com>
crypto: crypto4xx - fix crypto4xx_build_pdr, crypto4xx_build_sdr leak
Christian Lamparter <chunkeey(a)googlemail.com>
crypto: crypto4xx - remove bad list_del
Jonas Gorski <jonas.gorski(a)gmail.com>
bcm63xx_enet: do not write to random DMA channel on BCM6345
Jonas Gorski <jonas.gorski(a)gmail.com>
bcm63xx_enet: correct clock usage
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
loop: remember whether sysfs_create_group() was done
Leon Romanovsky <leonro(a)mellanox.com>
RDMA/ucm: Mark UCM interface as BROKEN
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
PM / hibernate: Fix oops at snapshot_write()
Theodore Ts'o <tytso(a)mit.edu>
loop: add recursion validation to LOOP_CHANGE_FD
Florian Westphal <fw(a)strlen.de>
netfilter: x_tables: initialise match/target check parameter struct
Linus Torvalds <torvalds(a)linux-foundation.org>
Fix up non-directory creation in SGID directories
Dan Carpenter <dan.carpenter(a)oracle.com>
xhci: xhci-mem: off by one in xhci_stream_id_to_ring()
Nico Sneck <snecknico(a)gmail.com>
usb: quirks: add delay quirks for Corsair Strafe
Johan Hovold <johan(a)kernel.org>
USB: serial: mos7840: fix status-register error handling
Jann Horn <jannh(a)google.com>
USB: yurex: fix out-of-bounds uaccess in read handler
Johan Hovold <johan(a)kernel.org>
USB: serial: keyspan_pda: fix modem-status error handling
Jann Horn <jannh(a)google.com>
ibmasm: don't write out of bounds in read handler
-------------
Diffstat:
Makefile | 4 +-
drivers/block/loop.c | 79 +++++++++++++++------------
drivers/block/loop.h | 1 +
drivers/crypto/amcc/crypto4xx_core.c | 23 ++++----
drivers/infiniband/Kconfig | 12 ++++
drivers/infiniband/core/Makefile | 4 +-
drivers/misc/ibmasm/ibmasmfs.c | 27 +--------
drivers/net/ethernet/broadcom/bcm63xx_enet.c | 34 +++++++++---
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 8 +--
drivers/net/ethernet/sun/sungem.c | 22 ++++----
drivers/usb/core/quirks.c | 4 ++
drivers/usb/host/xhci-mem.c | 2 +-
drivers/usb/misc/yurex.c | 23 ++------
drivers/usb/serial/keyspan_pda.c | 4 +-
drivers/usb/serial/mos7840.c | 3 +
drivers/vhost/net.c | 3 +-
fs/inode.c | 6 ++
kernel/power/user.c | 5 ++
net/bridge/netfilter/ebtables.c | 15 +++++
net/dccp/ccids/ccid3.c | 16 +++---
net/dns_resolver/dns_key.c | 28 ++++++----
net/ipv4/netfilter/ip_tables.c | 1 +
net/ipv4/sysctl_net_ipv4.c | 18 ++++--
net/ipv4/tcp_input.c | 9 +++
net/ipv6/netfilter/ip6_tables.c | 1 +
net/nfc/llcp_commands.c | 9 ++-
net/rds/loop.c | 1 +
net/rds/rds.h | 5 ++
net/rds/recv.c | 5 ++
net/sched/sch_blackhole.c | 2 +-
30 files changed, 228 insertions(+), 146 deletions(-)
For some time now, if you load the bonding driver and configure bond
parameters via sysfs using minimal config options, such as specifying
nothing but the mode, relying on defaults for everything else, modes
that cannot use arp monitoring (802.3ad, balance-tlb, balance-alb) all
wind up with both arp_interval=0 (as it should be) and miimon=0, which
means the miimon monitor thread never actually runs. This is particularly
problematic for 802.3ad.
For example, from an LNST recipe I've set up:
$ modprobe bonding max_bonds=0
$ echo "+t_bond0" > /sys/class/net/bonding_masters
$ ip link set t_bond0 down
$ echo "802.3ad" > /sys/class/net/t_bond0/bonding/mode
$ ip link set ens1f1 down
$ echo "+ens1f1" > /sys/class/net/t_bond0/bonding/slaves
$ ip link set ens1f0 down
$ echo "+ens1f0" > /sys/class/net/t_bond0/bonding/slaves
$ ethtool -i t_bond0
$ ip link set ens1f1 up
$ ip link set ens1f0 up
$ ip link set t_bond0 up
$ ip addr add 192.168.9.1/24 dev t_bond0
$ ip addr add 2002::1/64 dev t_bond0
This bond comes up okay, but things look slightly suspect in
/proc/net/bonding/t_bond0 output:
$ grep -i mii /proc/net/bonding/t_bond0
MII Status: up
MII Polling Interval (ms): 0
MII Status: up
MII Status: up
Now, pull a cable on one of the ports in the bond, then reconnect it, and
you'll see:
Slave Interface: ens1f0
MII Status: down
Speed: 1000 Mbps
Duplex: full
I believe this became a major issue as of commit 4d2c0cda0744, which for
802.3ad bonds, sets slave->link = BOND_LINK_DOWN, with a comment about
relying on link monitoring via miimon to set it correctly, but since the
miimon work queue never runs, the link just stays marked down.
If we simply tweak bond_option_mode_set() slightly, we can check for the
non-arp modes having no miimon value set, and insert BOND_DEFAULT_MIIMON,
which gets things back in full working order. This problem exists as far
back as 4.14, and might be worth fixing in all stable trees since, though
the work-around is to simply specify an miimon value yourself.
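For reference, the manual work-around looks like this (a sketch reusing the
t_bond0 example above; 100 matches the BOND_DEFAULT_MIIMON value the patch
applies automatically):
$ echo 100 > /sys/class/net/t_bond0/bonding/miimon
$ grep "MII Polling Interval" /proc/net/bonding/t_bond0
MII Polling Interval (ms): 100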
Reported-by: Bob Ball <ball(a)umich.edu>
CC: Jay Vosburgh <j.vosburgh(a)gmail.com>
CC: Veaceslav Falico <vfalico(a)gmail.com>
CC: Andy Gospodarek <andy(a)greyhouse.net>
CC: Mahesh Bandewar <maheshb(a)google.com>
CC: David S. Miller <davem(a)davemloft.net>
CC: netdev(a)vger.kernel.org
CC: stable(a)vger.kernel.org
Signed-off-by: Jarod Wilson <jarod(a)redhat.com>
---
drivers/net/bonding/bond_options.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 98663c50ded0..4d5d01cb8141 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -743,15 +743,20 @@ const struct bond_option *bond_opt_get(unsigned int option)
static int bond_option_mode_set(struct bonding *bond,
const struct bond_opt_value *newval)
{
- if (!bond_mode_uses_arp(newval->value) && bond->params.arp_interval) {
- netdev_dbg(bond->dev, "%s mode is incompatible with arp monitoring, start mii monitoring\n",
- newval->string);
- /* disable arp monitoring */
- bond->params.arp_interval = 0;
- /* set miimon to default value */
- bond->params.miimon = BOND_DEFAULT_MIIMON;
- netdev_dbg(bond->dev, "Setting MII monitoring interval to %d\n",
- bond->params.miimon);
+ if (!bond_mode_uses_arp(newval->value)) {
+ if (bond->params.arp_interval) {
+ netdev_dbg(bond->dev, "%s mode is incompatible with arp monitoring, start mii monitoring\n",
+ newval->string);
+ /* disable arp monitoring */
+ bond->params.arp_interval = 0;
+ }
+
+ if (!bond->params.miimon) {
+ /* set miimon to default value */
+ bond->params.miimon = BOND_DEFAULT_MIIMON;
+ netdev_dbg(bond->dev, "Setting MII monitoring interval to %d\n",
+ bond->params.miimon);
+ }
}
if (newval->value == BOND_MODE_ALB)
--
2.16.1
This is the start of the stable review cycle for the 4.9.114 release.
There are 66 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun Jul 22 12:13:47 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.114-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.114-rc1
Tejun Heo <tj(a)kernel.org>
string: drop __must_check from strscpy() and restore strscpy() usages in cgroup
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: KVM: Add ARCH_WORKAROUND_2 discovery through ARCH_FEATURES_FUNC_ID
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: KVM: Handle guest's ARCH_WORKAROUND_2 requests
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: KVM: Add ARCH_WORKAROUND_2 support for guests
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: KVM: Add HYP per-cpu accessors
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: ssbd: Add prctl interface for per-thread mitigation
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: ssbd: Introduce thread flag to control userspace mitigation
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: ssbd: Restore mitigation status on CPU resume
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: ssbd: Skip apply_ssbd if not using dynamic mitigation
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: ssbd: Add global mitigation state accessor
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: Add 'ssbd' command-line option
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: Add ARCH_WORKAROUND_2 probing
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: Add per-cpu infrastructure to call ARCH_WORKAROUND_2
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: Call ARCH_WORKAROUND_2 on transitions between EL0 and EL1
Marc Zyngier <marc.zyngier(a)arm.com>
arm/arm64: smccc: Add SMCCC-specific return codes
Christoffer Dall <christoffer.dall(a)linaro.org>
KVM: arm64: Avoid storing the vcpu pointer on the stack
Marc Zyngier <marc.zyngier(a)arm.com>
KVM: arm/arm64: Do not use kern_hyp_va() with kvm_vgic_global_state
Marc Zyngier <marc.zyngier(a)arm.com>
arm64: alternatives: Add dynamic patching feature
James Morse <james.morse(a)arm.com>
KVM: arm64: Stop save/restoring host tpidr_el1 on VHE
James Morse <james.morse(a)arm.com>
arm64: alternatives: use tpidr_el2 on VHE hosts
James Morse <james.morse(a)arm.com>
KVM: arm64: Change hyp_panic()s dependency on tpidr_el2
James Morse <james.morse(a)arm.com>
KVM: arm/arm64: Convert kvm_host_cpu_state to a static per-cpu allocation
James Morse <james.morse(a)arm.com>
KVM: arm64: Store vcpu on the stack during __guest_enter()
Mark Rutland <mark.rutland(a)arm.com>
arm64: assembler: introduce ldr_this_cpu
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
net/nfc: Avoid stalls when nfc_alloc_send_skb() returned NULL.
Santosh Shilimkar <santosh.shilimkar(a)oracle.com>
rds: avoid unenecessary cong_update in loop transport
Florian Westphal <fw(a)strlen.de>
netfilter: ipv6: nf_defrag: drop skb dst before queueing
Eric Biggers <ebiggers(a)google.com>
KEYS: DNS: fix parsing multiple options
Eric Biggers <ebiggers(a)google.com>
reiserfs: fix buffer overflow with long warning messages
Florian Westphal <fw(a)strlen.de>
netfilter: ebtables: reject non-bridge targets
Stefan Wahren <stefan.wahren(a)i2se.com>
net: lan78xx: Fix race in tx pending skb size calculation
Ping-Ke Shih <pkshih(a)realtek.com>
rtlwifi: rtl8821ae: fix firmware is not ready to run
Gustavo A. R. Silva <gustavo(a)embeddedor.com>
net: cxgb3_main: fix potential Spectre v1
Alex Vesker <valex(a)mellanox.com>
net/mlx5: Fix command interface race in polling mode
Eric Dumazet <edumazet(a)google.com>
net/packet: fix use-after-free
Jason Wang <jasowang(a)redhat.com>
vhost_net: validate sock before trying to put its fd
Ilpo Järvinen <ilpo.jarvinen(a)helsinki.fi>
tcp: prevent bogus FRTO undos with non-SACK flows
Yuchung Cheng <ycheng(a)google.com>
tcp: fix Fast Open key endianness
Jiri Slaby <jslaby(a)suse.cz>
r8152: napi hangup fix after disconnect
Aleksander Morgado <aleksander(a)aleksander.es>
qmi_wwan: add support for the Dell Wireless 5821e module
Sudarsana Reddy Kalluru <sudarsana.kalluru(a)cavium.com>
qed: Limit msix vectors in kdump kernel to the minimum required count.
Sudarsana Reddy Kalluru <sudarsana.kalluru(a)cavium.com>
qed: Fix use of incorrect size in memcpy call.
Eric Dumazet <edumazet(a)google.com>
net: sungem: fix rx checksum support
Konstantin Khlebnikov <khlebnikov(a)yandex-team.ru>
net_sched: blackhole: tell upper qdisc about dropped packets
Shay Agroskin <shayag(a)mellanox.com>
net/mlx5: Fix wrong size allocation for QoS ETC TC regitster
Alex Vesker <valex(a)mellanox.com>
net/mlx5: Fix incorrect raw command length parsing
Eric Dumazet <edumazet(a)google.com>
net: dccp: switch rx_tstamp_last_feedback to monotonic clock
Eric Dumazet <edumazet(a)google.com>
net: dccp: avoid crash in ccid3_hc_rx_send_feedback()
Xin Long <lucien.xin(a)gmail.com>
ipvlan: fix IFLA_MTU ignored on NEWLINK
Gustavo A. R. Silva <gustavo(a)embeddedor.com>
atm: zatm: Fix potential Spectre v1
Christian Lamparter <chunkeey(a)googlemail.com>
crypto: crypto4xx - fix crypto4xx_build_pdr, crypto4xx_build_sdr leak
Christian Lamparter <chunkeey(a)googlemail.com>
crypto: crypto4xx - remove bad list_del
Jonas Gorski <jonas.gorski(a)gmail.com>
bcm63xx_enet: do not write to random DMA channel on BCM6345
Jonas Gorski <jonas.gorski(a)gmail.com>
bcm63xx_enet: correct clock usage
Jonas Gorski <jonas.gorski(a)gmail.com>
spi/bcm63xx: fix typo in bcm63xx_spi_max_length breaking compilation
Jonas Gorski <jonas.gorski(a)gmail.com>
spi/bcm63xx: make spi subsystem aware of message size limits
Heiner Kallweit <hkallweit1(a)gmail.com>
mtd: m25p80: consider max message size in m25p80_read
alex chen <alex.chen(a)huawei.com>
ocfs2: ip_alloc_sem should be taken in ocfs2_get_block()
alex chen <alex.chen(a)huawei.com>
ocfs2: subsystem.su_mutex is required while accessing the item->ci_parent
Nick Desaulniers <ndesaulniers(a)google.com>
x86/paravirt: Make native_save_fl() extern inline
H. Peter Anvin <hpa(a)linux.intel.com>
x86/asm: Add _ASM_ARG* constants for argument registers to <asm/asm.h>
Nick Desaulniers <ndesaulniers(a)google.com>
compiler-gcc.h: Add __attribute__((gnu_inline)) to all inline declarations
David Rientjes <rientjes(a)google.com>
compiler, clang: always inline when CONFIG_OPTIMIZE_INLINING is disabled
Linus Torvalds <torvalds(a)linux-foundation.org>
compiler, clang: properly override 'inline' for clang
David Rientjes <rientjes(a)google.com>
compiler, clang: suppress warning for unused static inline functions
Paul Burton <paul.burton(a)mips.com>
MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
-------------
Diffstat:
Documentation/kernel-parameters.txt | 17 +++
Makefile | 4 +-
arch/arm/include/asm/kvm_host.h | 12 ++
arch/arm/include/asm/kvm_mmu.h | 12 ++
arch/arm/kvm/arm.c | 24 ++--
arch/arm/kvm/psci.c | 18 ++-
arch/arm64/Kconfig | 9 ++
arch/arm64/include/asm/alternative.h | 43 +++++-
arch/arm64/include/asm/assembler.h | 27 +++-
arch/arm64/include/asm/cpucaps.h | 3 +-
arch/arm64/include/asm/cpufeature.h | 22 +++
arch/arm64/include/asm/kvm_asm.h | 41 ++++++
arch/arm64/include/asm/kvm_host.h | 43 ++++++
arch/arm64/include/asm/kvm_mmu.h | 44 ++++++
arch/arm64/include/asm/percpu.h | 12 +-
arch/arm64/include/asm/thread_info.h | 1 +
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/alternative.c | 54 ++++---
arch/arm64/kernel/asm-offsets.c | 2 +
arch/arm64/kernel/cpu_errata.c | 180 ++++++++++++++++++++++++
arch/arm64/kernel/cpufeature.c | 17 +++
arch/arm64/kernel/entry.S | 32 ++++-
arch/arm64/kernel/hibernate.c | 11 ++
arch/arm64/kernel/ssbd.c | 108 ++++++++++++++
arch/arm64/kernel/suspend.c | 8 ++
arch/arm64/kvm/hyp-init.S | 4 +
arch/arm64/kvm/hyp/entry.S | 12 +-
arch/arm64/kvm/hyp/hyp-entry.S | 62 ++++++--
arch/arm64/kvm/hyp/switch.c | 64 +++++++--
arch/arm64/kvm/hyp/sysreg-sr.c | 21 +--
arch/arm64/kvm/reset.c | 4 +
arch/mips/kernel/process.c | 45 ++++--
arch/x86/include/asm/asm.h | 59 ++++++++
arch/x86/include/asm/irqflags.h | 2 +-
arch/x86/kernel/Makefile | 1 +
arch/x86/kernel/irqflags.S | 26 ++++
drivers/atm/zatm.c | 2 +
drivers/crypto/amcc/crypto4xx_core.c | 23 ++-
drivers/mtd/devices/m25p80.c | 3 +-
drivers/net/ethernet/broadcom/bcm63xx_enet.c | 34 +++--
drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c | 2 +
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 8 +-
drivers/net/ethernet/mellanox/mlx5/core/port.c | 4 +-
drivers/net/ethernet/qlogic/qed/qed_dcbx.c | 8 +-
drivers/net/ethernet/qlogic/qed/qed_main.c | 9 ++
drivers/net/ethernet/sun/sungem.c | 22 +--
drivers/net/ipvlan/ipvlan_main.c | 3 +-
drivers/net/usb/lan78xx.c | 5 +-
drivers/net/usb/qmi_wwan.c | 1 +
drivers/net/usb/r8152.c | 3 +-
drivers/net/wireless/realtek/rtlwifi/core.c | 1 -
drivers/spi/spi-bcm63xx.c | 9 ++
drivers/vhost/net.c | 3 +-
fs/ocfs2/aops.c | 26 ++--
fs/ocfs2/cluster/nodemanager.c | 63 +++++++--
fs/reiserfs/prints.c | 141 +++++++++++--------
include/linux/arm-smccc.h | 10 ++
include/linux/compiler-gcc.h | 35 +++--
include/linux/string.h | 2 +-
net/bridge/netfilter/ebtables.c | 13 ++
net/dccp/ccids/ccid3.c | 16 ++-
net/dns_resolver/dns_key.c | 28 ++--
net/ipv4/sysctl_net_ipv4.c | 18 ++-
net/ipv4/tcp_input.c | 9 ++
net/ipv6/netfilter/nf_conntrack_reasm.c | 2 +
net/nfc/llcp_commands.c | 9 +-
net/packet/af_packet.c | 14 +-
net/rds/loop.c | 1 +
net/rds/rds.h | 5 +
net/rds/recv.c | 5 +
net/sched/sch_blackhole.c | 2 +-
virt/kvm/arm/hyp/vgic-v2-sr.c | 2 +-
72 files changed, 1315 insertions(+), 271 deletions(-)
This is a note to let you know that I've just added the patch titled
pty: fix O_CLOEXEC for TIOCGPTPEER
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 36ecc1481dc8d8c52d43ba18c6b642c1d2fde789 Mon Sep 17 00:00:00 2001
From: Matthijs van Duin <matthijsvanduin(a)gmail.com>
Date: Thu, 19 Jul 2018 10:43:46 +0200
Subject: pty: fix O_CLOEXEC for TIOCGPTPEER
It was being ignored because the flags were not passed to fd allocation.
Fixes: 54ebbfb16034 ("tty: add TIOCGPTPEER ioctl")
Signed-off-by: Matthijs van Duin <matthijsvanduin(a)gmail.com>
Acked-by: Aleksa Sarai <asarai(a)suse.de>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/pty.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
index b0e2c4847a5d..678406e0948b 100644
--- a/drivers/tty/pty.c
+++ b/drivers/tty/pty.c
@@ -625,7 +625,7 @@ int ptm_open_peer(struct file *master, struct tty_struct *tty, int flags)
if (tty->driver != ptm_driver)
return -EIO;
- fd = get_unused_fd_flags(0);
+ fd = get_unused_fd_flags(flags);
if (fd < 0) {
retval = fd;
goto err;
--
2.18.0
This is a note to let you know that I've just added the patch titled
uio: fix wrong return value from uio_mmap()
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From e7de2590f18a272e63732b9d519250d1b522b2c4 Mon Sep 17 00:00:00 2001
From: Hailong Liu <liu.hailong6(a)zte.com.cn>
Date: Fri, 20 Jul 2018 08:31:56 +0800
Subject: uio: fix wrong return value from uio_mmap()
uio_mmap() has multiple failure paths that set a nonzero return value and then
goto out. However, it always returns *0* from the *out* label at the end, which
misleads callers that check the return value of this function.
Fixes: 57c5f4df0a5a0ee ("uio: fix crash after the device is unregistered")
CC: Xiubo Li <xiubli(a)redhat.com>
Signed-off-by: Hailong Liu <liu.hailong6(a)zte.com.cn>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Jiang Biao <jiang.biao2(a)zte.com.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/uio/uio.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index f63967c8e95a..144cf7365288 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -813,7 +813,7 @@ static int uio_mmap(struct file *filep, struct vm_area_struct *vma)
out:
mutex_unlock(&idev->info_lock);
- return 0;
+ return ret;
}
static const struct file_operations uio_fops = {
--
2.18.0
This is a note to let you know that I've just added the patch titled
pty: fix O_CLOEXEC for TIOCGPTPEER
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the tty-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 36ecc1481dc8d8c52d43ba18c6b642c1d2fde789 Mon Sep 17 00:00:00 2001
From: Matthijs van Duin <matthijsvanduin(a)gmail.com>
Date: Thu, 19 Jul 2018 10:43:46 +0200
Subject: pty: fix O_CLOEXEC for TIOCGPTPEER
It was being ignored because the flags were not passed to fd allocation.
Fixes: 54ebbfb16034 ("tty: add TIOCGPTPEER ioctl")
Signed-off-by: Matthijs van Duin <matthijsvanduin(a)gmail.com>
Acked-by: Aleksa Sarai <asarai(a)suse.de>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/pty.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
index b0e2c4847a5d..678406e0948b 100644
--- a/drivers/tty/pty.c
+++ b/drivers/tty/pty.c
@@ -625,7 +625,7 @@ int ptm_open_peer(struct file *master, struct tty_struct *tty, int flags)
if (tty->driver != ptm_driver)
return -EIO;
- fd = get_unused_fd_flags(0);
+ fd = get_unused_fd_flags(flags);
if (fd < 0) {
retval = fd;
goto err;
--
2.18.0
This is a note to let you know that I've just added the patch titled
uio: fix wrong return value from uio_mmap()
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the char-misc-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From e7de2590f18a272e63732b9d519250d1b522b2c4 Mon Sep 17 00:00:00 2001
From: Hailong Liu <liu.hailong6(a)zte.com.cn>
Date: Fri, 20 Jul 2018 08:31:56 +0800
Subject: uio: fix wrong return value from uio_mmap()
uio_mmap() has multiple failure paths that set a nonzero return value and then
goto out. However, it always returns *0* from the *out* label at the end, which
misleads callers that check the return value of this function.
Fixes: 57c5f4df0a5a0ee ("uio: fix crash after the device is unregistered")
CC: Xiubo Li <xiubli(a)redhat.com>
Signed-off-by: Hailong Liu <liu.hailong6(a)zte.com.cn>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Jiang Biao <jiang.biao2(a)zte.com.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/uio/uio.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index f63967c8e95a..144cf7365288 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -813,7 +813,7 @@ static int uio_mmap(struct file *filep, struct vm_area_struct *vma)
out:
mutex_unlock(&idev->info_lock);
- return 0;
+ return ret;
}
static const struct file_operations uio_fops = {
--
2.18.0
This is a note to let you know that I've just added the patch titled
usb: core: handle hub C_PORT_OVER_CURRENT condition
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 249a32b7eeb3edb6897dd38f89651a62163ac4ed Mon Sep 17 00:00:00 2001
From: Bin Liu <b-liu(a)ti.com>
Date: Thu, 19 Jul 2018 14:39:37 -0500
Subject: usb: core: handle hub C_PORT_OVER_CURRENT condition
Based on USB2.0 Spec Section 11.12.5,
"If a hub has per-port power switching and per-port current limiting,
an over-current on one port may still cause the power on another port
to fall below specific minimums. In this case, the affected port is
placed in the Power-Off state and C_PORT_OVER_CURRENT is set for the
port, but PORT_OVER_CURRENT is not set."
so let's check C_PORT_OVER_CURRENT too for the over-current condition.
Fixes: 08d1dec6f405 ("usb:hub set hub->change_bits when over-current happens")
Cc: <stable(a)vger.kernel.org>
Tested-by: Alessandro Antenucci <antenucci(a)korg.it>
Signed-off-by: Bin Liu <b-liu(a)ti.com>
Acked-by: Alan Stern <stern(a)rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/core/hub.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index fcae521df29b..1fb266809966 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -1142,10 +1142,14 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
if (!udev || udev->state == USB_STATE_NOTATTACHED) {
/* Tell hub_wq to disconnect the device or
- * check for a new connection
+ * check for a new connection or over current condition.
+ * Based on USB2.0 Spec Section 11.12.5,
+ * C_PORT_OVER_CURRENT could be set while
+ * PORT_OVER_CURRENT is not. So check for any of them.
*/
if (udev || (portstatus & USB_PORT_STAT_CONNECTION) ||
- (portstatus & USB_PORT_STAT_OVERCURRENT))
+ (portstatus & USB_PORT_STAT_OVERCURRENT) ||
+ (portchange & USB_PORT_STAT_C_OVERCURRENT))
set_bit(port1, hub->change_bits);
} else if (portstatus & USB_PORT_STAT_ENABLE) {
--
2.18.0
This is a note to let you know that I've just added the patch titled
USB: serial: sierra: fix potential deadlock at close
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From e60870012e5a35b1506d7b376fddfb30e9da0b27 Mon Sep 17 00:00:00 2001
From: John Ogness <john.ogness(a)linutronix.de>
Date: Sun, 24 Jun 2018 00:32:11 +0200
Subject: USB: serial: sierra: fix potential deadlock at close
The portdata spinlock can be taken in interrupt context (via
sierra_outdat_callback()).
Disable interrupts when taking the portdata spinlock when discarding
deferred URBs during close to prevent a possible deadlock.
Fixes: 014333f77c0b ("USB: sierra: fix urb and memory leak on disconnect")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: John Ogness <john.ogness(a)linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
[ johan: amend commit message and add fixes and stable tags ]
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/usb/serial/sierra.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/serial/sierra.c b/drivers/usb/serial/sierra.c
index d189f953c891..55956a638f5b 100644
--- a/drivers/usb/serial/sierra.c
+++ b/drivers/usb/serial/sierra.c
@@ -770,9 +770,9 @@ static void sierra_close(struct usb_serial_port *port)
kfree(urb->transfer_buffer);
usb_free_urb(urb);
usb_autopm_put_interface_async(serial->interface);
- spin_lock(&portdata->lock);
+ spin_lock_irq(&portdata->lock);
portdata->outstanding_urbs--;
- spin_unlock(&portdata->lock);
+ spin_unlock_irq(&portdata->lock);
}
sierra_stop_rx_urbs(port);
--
2.18.0
From: Jing Xia <jing.xia.mail(a)gmail.com>
Subject: mm: memcg: fix use after free in mem_cgroup_iter()
It was reported that a kernel crash happened in mem_cgroup_iter(), which
can be triggered if the legacy cgroup-v1 non-hierarchical mode is used.
Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b8f
......
Call trace:
mem_cgroup_iter+0x2e0/0x6d4
shrink_zone+0x8c/0x324
balance_pgdat+0x450/0x640
kswapd+0x130/0x4b8
kthread+0xe8/0xfc
ret_from_fork+0x10/0x20
mem_cgroup_iter():
......
if (css_tryget(css)) <-- crash here
break;
......
The reason for the crash is that mem_cgroup_iter() uses the memcg object whose
pointer is stored in iter->position, and which has already been freed and
filled with POISON_FREE (0x6b).
The root cause of the use-after-free issue is that
invalidate_reclaim_iterators() fails to reset the value of iter->position
to NULL when the css of the memcg is released in non-hierarchical mode.
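For context (a sketch; the path assumes the cgroup-v1 memory controller is
mounted in the usual place), the affected configuration is the legacy
non-hierarchical one:
$ cat /sys/fs/cgroup/memory/memory.use_hierarchy
0
The one-line change below makes the invalidation loop start at the memcg whose
css is being released instead of at its parent, so the memcg's own per-node
iterators are reset as well.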
Link: http://lkml.kernel.org/r/1531994807-25639-1-git-send-email-jing.xia@unisoc.…
Fixes: 6df38689e0e9 ("mm: memcontrol: fix possible memcg leak due to interrupted reclaim")
Signed-off-by: Jing Xia <jing.xia.mail(a)gmail.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com>
Cc: <chunyan.zhang(a)unisoc.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -puN mm/memcontrol.c~mm-memcg-fix-use-after-free-in-mem_cgroup_iter mm/memcontrol.c
--- a/mm/memcontrol.c~mm-memcg-fix-use-after-free-in-mem_cgroup_iter
+++ a/mm/memcontrol.c
@@ -850,7 +850,7 @@ static void invalidate_reclaim_iterators
int nid;
int i;
- while ((memcg = parent_mem_cgroup(memcg))) {
+ for (; memcg; memcg = parent_mem_cgroup(memcg)) {
for_each_node(nid) {
mz = mem_cgroup_nodeinfo(memcg, nid);
for (i = 0; i <= DEF_PRIORITY; i++) {
_
From: Hugh Dickins <hughd(a)google.com>
Subject: mm/huge_memory.c: fix data loss when splitting a file pmd
__split_huge_pmd_locked() must check if the cleared huge pmd was dirty,
and propagate that to PageDirty: otherwise, data may be lost when a huge
tmpfs page is modified then split then reclaimed.
How has this taken so long to be noticed? Because there was no problem
when the huge page is written by a write system call (shmem_write_end()
calls set_page_dirty()), nor when the page is allocated for a write fault
(fault_dirty_shared_page() calls set_page_dirty()); but when allocated for
a read fault (which MAP_POPULATE simulates), no set_page_dirty().
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1807111741430.1106@eggly.anvils
Fixes: d21b9e57c74c ("thp: handle file pages in split_huge_pmd()")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: Ashwin Chaugule <ashwinch(a)google.com>
Reviewed-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: <stable(a)vger.kernel.org> [4.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/huge_memory.c | 2 ++
1 file changed, 2 insertions(+)
diff -puN mm/huge_memory.c~thp-fix-data-loss-when-splitting-a-file-pmd mm/huge_memory.c
--- a/mm/huge_memory.c~thp-fix-data-loss-when-splitting-a-file-pmd
+++ a/mm/huge_memory.c
@@ -2084,6 +2084,8 @@ static void __split_huge_pmd_locked(stru
if (vma_is_dax(vma))
return;
page = pmd_page(_pmd);
+ if (!PageDirty(page) && pmd_dirty(_pmd))
+ set_page_dirty(page);
if (!PageReferenced(page) && pmd_young(_pmd))
SetPageReferenced(page);
page_remove_rmap(page, true);
_
This is a note to let you know that I've just added the patch titled
USB: serial: sierra: fix potential deadlock at close
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the usb-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From e60870012e5a35b1506d7b376fddfb30e9da0b27 Mon Sep 17 00:00:00 2001
From: John Ogness <john.ogness(a)linutronix.de>
Date: Sun, 24 Jun 2018 00:32:11 +0200
Subject: USB: serial: sierra: fix potential deadlock at close
The portdata spinlock can be taken in interrupt context (via
sierra_outdat_callback()).
Disable interrupts when taking the portdata spinlock when discarding
deferred URBs during close to prevent a possible deadlock.
Fixes: 014333f77c0b ("USB: sierra: fix urb and memory leak on disconnect")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: John Ogness <john.ogness(a)linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
[ johan: amend commit message and add fixes and stable tags ]
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/usb/serial/sierra.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/serial/sierra.c b/drivers/usb/serial/sierra.c
index d189f953c891..55956a638f5b 100644
--- a/drivers/usb/serial/sierra.c
+++ b/drivers/usb/serial/sierra.c
@@ -770,9 +770,9 @@ static void sierra_close(struct usb_serial_port *port)
kfree(urb->transfer_buffer);
usb_free_urb(urb);
usb_autopm_put_interface_async(serial->interface);
- spin_lock(&portdata->lock);
+ spin_lock_irq(&portdata->lock);
portdata->outstanding_urbs--;
- spin_unlock(&portdata->lock);
+ spin_unlock_irq(&portdata->lock);
}
sierra_stop_rx_urbs(port);
--
2.18.0
This is a note to let you know that I've just added the patch titled
usb: xhci: Fix memory leak in xhci_endpoint_reset()
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From d89b7664f76047e7beca8f07e86f2ccfad085a28 Mon Sep 17 00:00:00 2001
From: Zheng Xiaowei <zhengxiaowei(a)ruijie.com.cn>
Date: Fri, 20 Jul 2018 18:05:11 +0300
Subject: usb: xhci: Fix memory leak in xhci_endpoint_reset()
If td_list is not empty, cfg_cmd will not be freed;
call xhci_free_command() to free it.
Signed-off-by: Zheng Xiaowei <zhengxiaowei(a)ruijie.com.cn>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/host/xhci.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 2f4850f25e82..68e6132aa8b2 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -3051,6 +3051,7 @@ static void xhci_endpoint_reset(struct usb_hcd *hcd,
if (!list_empty(&ep->ring->td_list)) {
dev_err(&udev->dev, "EP not empty, refuse reset\n");
spin_unlock_irqrestore(&xhci->lock, flags);
+ xhci_free_command(xhci, cfg_cmd);
goto cleanup;
}
xhci_queue_stop_endpoint(xhci, stop_cmd, udev->slot_id, ep_index, 0);
--
2.18.0
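As an aside, the leak fixed above is the classic pattern of an early-exit path skipping a cleanup
that only the success path performs. A minimal sketch with hypothetical names, not xHCI code:

#include <linux/errno.h>
#include <linux/types.h>

/* Illustrative sketch only -- hypothetical names, not xHCI code. */
struct demo_cmd;

struct demo_cmd *demo_alloc_command(void);
void demo_free_command(struct demo_cmd *cmd);
void demo_queue_command(struct demo_cmd *cmd);
bool demo_endpoint_busy(void);

static int demo_endpoint_reset(void)
{
	struct demo_cmd *cfg_cmd = demo_alloc_command();

	if (!cfg_cmd)
		return -ENOMEM;

	if (demo_endpoint_busy()) {
		/*
		 * Without this free, cfg_cmd is never queued and nothing
		 * else will release it -- the memory leak the fix plugs.
		 */
		demo_free_command(cfg_cmd);
		return -EINVAL;
	}

	demo_queue_command(cfg_cmd);	/* ownership passes to the queue */
	return 0;
}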
Based on USB2.0 Spec Section 11.12.5,
"If a hub has per-port power switching and per-port current limiting,
an over-current on one port may still cause the power on another port
to fall below specific minimums. In this case, the affected port is
placed in the Power-Off state and C_PORT_OVER_CURRENT is set for the
port, but PORT_OVER_CURRENT is not set."
so let's also check C_PORT_OVER_CURRENT for the over-current condition.
Fixes: 08d1dec6f405 ("usb:hub set hub->change_bits when over-current happens")
Cc: <stable(a)vger.kernel.org>
Tested-by: Alessandro Antenucci <antenucci(a)korg.it>
Signed-off-by: Bin Liu <b-liu(a)ti.com>
---
drivers/usb/core/hub.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index fcae521df29b..1fb266809966 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -1142,10 +1142,14 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
if (!udev || udev->state == USB_STATE_NOTATTACHED) {
/* Tell hub_wq to disconnect the device or
- * check for a new connection
+ * check for a new connection or over current condition.
+ * Based on USB2.0 Spec Section 11.12.5,
+ * C_PORT_OVER_CURRENT could be set while
+ * PORT_OVER_CURRENT is not. So check for any of them.
*/
if (udev || (portstatus & USB_PORT_STAT_CONNECTION) ||
- (portstatus & USB_PORT_STAT_OVERCURRENT))
+ (portstatus & USB_PORT_STAT_OVERCURRENT) ||
+ (portchange & USB_PORT_STAT_C_OVERCURRENT))
set_bit(port1, hub->change_bits);
} else if (portstatus & USB_PORT_STAT_ENABLE) {
--
1.9.1
This is a note to let you know that I've just added the patch titled
usb: gadget: f_fs: Only return delayed status when len is 0
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 4d644abf25698362bd33d17c9ddc8f7122c30f17 Mon Sep 17 00:00:00 2001
From: Jerry Zhang <zhangjerry(a)google.com>
Date: Mon, 2 Jul 2018 12:48:08 -0700
Subject: usb: gadget: f_fs: Only return delayed status when len is 0
Commit 1b9ba000 ("Allow function drivers to pause control
transfers") states that USB_GADGET_DELAYED_STATUS is only
supported if data phase is 0 bytes.
It seems that when the length is not 0 bytes, there is no
need to explicitly delay the data stage since the transfer
is not completed until the user responds. However, when the
length is 0, there is no data stage and the transfer is
finished once setup() returns, hence there is a need to
explicitly delay completion.
This manifests as the following bugs:
Prior to 946ef68ad4e4 ('Let setup() return
USB_GADGET_DELAYED_STATUS'), when the setup is 0 bytes, ffs
would require the user to queue a 0-byte request in order to
clear the setup state. However, that 0-byte request was actually
not needed and would hang and cause errors in other setup
requests.
After the above commit, 0 byte setups work since the gadget
now accepts empty queues to ep0 to clear the delay, but all
other setups hang.
Fixes: 946ef68ad4e4 ("Let setup() return USB_GADGET_DELAYED_STATUS")
Signed-off-by: Jerry Zhang <zhangjerry(a)google.com>
Cc: stable <stable(a)vger.kernel.org>
Acked-by: Felipe Balbi <felipe.balbi(a)linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/gadget/function/f_fs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index 33e2030503fa..3ada83d81bda 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -3263,7 +3263,7 @@ static int ffs_func_setup(struct usb_function *f,
__ffs_event_add(ffs, FUNCTIONFS_SETUP);
spin_unlock_irqrestore(&ffs->ev.waitq.lock, flags);
- return USB_GADGET_DELAYED_STATUS;
+ return creq->wLength == 0 ? USB_GADGET_DELAYED_STATUS : 0;
}
static bool ffs_func_req_match(struct usb_function *f,
--
2.18.0
On Fri, Jul 20, 2018 at 10:00:34AM +0200, Dmitry Vyukov wrote:
> On Fri, Jul 20, 2018 at 9:53 AM, James Chapman <jchapman(a)katalix.com> wrote:
> > On 18/07/18 12:00, Dmitry Vyukov wrote:
> >> On Tue, Jan 16, 2018 at 7:29 PM, syzbot
> >> <syzbot+065d0fc357520c8f6039(a)syzkaller.appspotmail.com> wrote:
> >>> Hello,
> >>>
> >>> syzkaller hit the following crash on
> >>> a8750ddca918032d6349adbf9a4b6555e7db20da
> >>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
> >>> compiler: gcc (GCC) 7.1.1 20170620
> >>> .config is attached
> >>> Raw console output is attached.
> >>> Unfortunately, I don't have any reproducer for this bug yet.
> >>>
> >>>
> >>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >>> Reported-by: syzbot+065d0fc357520c8f6039(a)syzkaller.appspotmail.com
> >>> It will help syzbot understand when the bug is fixed. See footer for
> >>> details.
> >>> If you forward the report, please keep this part and the footer.
> >>
> >> James,
> >>
> >> Did you fix this? You asked syzbot to test a fix for this bug some time ago.
> >> If yes, did you include the Reported-by tag in the commit? This bug is
> >> still considered open by syzbot. But it stopped happening ~4 months
> >> ago:
> >
> > Yes, I think this has been fixed now. I think it was fixed by
> > Guillaume's 6b9f34239b00e6956a267abed2bc559ede556ad6 that was actually
> > to fix another syzbot bug fbeeb5c3b538e8545644 which looks similar to
> > this one.
> >
> >> https://syzkaller.appspot.com/bug?id=6fed0854381422329e78d7e16fb9cf4af8c9ae…
> >> We are also seeing these crashes in 4.4 and 4.9, it would be good to
> >> backport the fix.
> >
> > It looks like 6b9f34239b00e6956a267abed2bc559ede556ad6 hasn't made it to
> > 4.9 or 4.4.
>
> Thanks for the update!
>
> Let's tell syzbot that this is fixed:
>
> #syz fix: l2tp: fix races in tunnel creation
>
> Greg H: so this is probably the patch we need.
>
> +Greg KH: I think we need this in stable, we hit this in both 4.4 and 4.9.
It's also needed in 4.14.y, but it doesn't apply cleanly to any of those
kernel trees. Can someone please provide a working backport?
thanks,
greg k-h
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3f6e6986045d47f87bd982910821b7ab9758487e Mon Sep 17 00:00:00 2001
From: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Date: Sat, 23 Jun 2018 01:06:34 +0900
Subject: [PATCH] mtd: rawnand: denali_dt: set clk_x_rate to 200 MHz
unconditionally
Since commit 1bb88666775e ("mtd: nand: denali: handle timing parameters
by setup_data_interface()"), denali_dt.c gets the clock rate from the
clock driver. The driver expects the frequency of the bus interface
clock, whereas the clock driver of SOCFPGA provides the core clock.
Thus, the setup_data_interface() hook calculates timing parameters
based on a wrong frequency.
To make it work without relying on the clock driver, hard-code the clock
frequency, 200MHz. This is fine for existing DT of UniPhier, and also
fixes the issue of SOCFPGA because both platforms use 200 MHz for the
bus interface clock.
Fixes: 1bb88666775e ("mtd: nand: denali: handle timing parameters by setup_data_interface()")
Cc: linux-stable <stable(a)vger.kernel.org> #4.14+
Reported-by: Philipp Rosenberger <p.rosenberger(a)linutronix.de>
Suggested-by: Boris Brezillon <boris.brezillon(a)bootlin.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Tested-by: Richard Weinberger <richard(a)nod.at>
Signed-off-by: Boris Brezillon <boris.brezillon(a)bootlin.com>
diff --git a/drivers/mtd/nand/raw/denali_dt.c b/drivers/mtd/nand/raw/denali_dt.c
index cfd33e6ca77f..5869e90cc14b 100644
--- a/drivers/mtd/nand/raw/denali_dt.c
+++ b/drivers/mtd/nand/raw/denali_dt.c
@@ -123,7 +123,11 @@ static int denali_dt_probe(struct platform_device *pdev)
if (ret)
return ret;
- denali->clk_x_rate = clk_get_rate(dt->clk);
+ /*
+ * Hardcode the clock rate for the backward compatibility.
+ * This works for both SOCFPGA and UniPhier.
+ */
+ denali->clk_x_rate = 200000000;
ret = denali_init(denali);
if (ret)
The patch titled
Subject: mm: memcg: fix use after free in mem_cgroup_iter()
has been added to the -mm tree. Its filename is
mm-memcg-fix-use-after-free-in-mem_cgroup_iter.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-fix-use-after-free-in-mem…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-fix-use-after-free-in-mem…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Jing Xia <jing.xia.mail(a)gmail.com>
Subject: mm: memcg: fix use after free in mem_cgroup_iter()
It was reported that a kernel crash happened in mem_cgroup_iter(), which
can be triggered if the legacy cgroup-v1 non-hierarchical mode is used.
Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b8f
......
Call trace:
mem_cgroup_iter+0x2e0/0x6d4
shrink_zone+0x8c/0x324
balance_pgdat+0x450/0x640
kswapd+0x130/0x4b8
kthread+0xe8/0xfc
ret_from_fork+0x10/0x20
mem_cgroup_iter():
......
if (css_tryget(css)) <-- crash here
break;
......
The crash happens because mem_cgroup_iter() uses a memcg object whose
pointer is stored in iter->position but which has already been freed and
filled with POISON_FREE (0x6b).
The root cause of the use-after-free is that invalidate_reclaim_iterators()
fails to reset iter->position to NULL when the css of the memcg is
released in non-hierarchical mode.
Link: http://lkml.kernel.org/r/1531994807-25639-1-git-send-email-jing.xia@unisoc.…
Fixes: 6df38689e0e9 ("mm: memcontrol: fix possible memcg leak due to interrupted reclaim")
Signed-off-by: Jing Xia <jing.xia.mail(a)gmail.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com>
Cc: <chunyan.zhang(a)unisoc.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
diff -puN mm/memcontrol.c~mm-memcg-fix-use-after-free-in-mem_cgroup_iter mm/memcontrol.c
--- a/mm/memcontrol.c~mm-memcg-fix-use-after-free-in-mem_cgroup_iter
+++ a/mm/memcontrol.c
@@ -850,7 +850,7 @@ static void invalidate_reclaim_iterators
int nid;
int i;
- while ((memcg = parent_mem_cgroup(memcg))) {
+ for (; memcg; memcg = parent_mem_cgroup(memcg)) {
for_each_node(nid) {
mz = mem_cgroup_nodeinfo(memcg, nid);
for (i = 0; i <= DEF_PRIORITY; i++) {
_
Patches currently in -mm which might be from jing.xia.mail(a)gmail.com are
mm-memcg-fix-use-after-free-in-mem_cgroup_iter.patch
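As a side note, the subtle difference between the two loop shapes in the hunk above can be seen in
a small sketch with illustrative types only (this is not mm/memcontrol.c): the while form advances
to the parent before the first visit, so the starting node itself is never cleared, while the for
form includes it.

/* Illustrative sketch only -- not mm/memcontrol.c. */
struct demo_node {
	struct demo_node *parent;
};

static void demo_reset_iterators(struct demo_node *n)
{
	/* stand-in for clearing n's cached reclaim iterator positions */
}

static void demo_invalidate_ancestors_only(struct demo_node *n)
{
	while ((n = n->parent))		/* advances first: skips the start node */
		demo_reset_iterators(n);
}

static void demo_invalidate_self_and_ancestors(struct demo_node *n)
{
	for (; n; n = n->parent)	/* visits the start node too -- the fix */
		demo_reset_iterators(n);
}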
A report from Colin Ian King pointed out a CoverityScan issue where the
error values returned by these helpers were not checked in the drivers.
These helpers can error out only in case of a software bug in driver
code, not because of a runtime/hardware error. Hence, let's WARN_ON() in
this case and return 0, which is harmless anyway.
Fixes: 8878b126df76 ("mtd: nand: add ->exec_op() implementation")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
---
Changes since v1:
=================
* At first I decided to continue returning negative errors and
handling these cases in the drivers. As reported by Boris, that was not
the right thing to do, so now the core WARN()s on error (which can only
be due to a bug in a controller driver) and returns a harmless
value. The drivers are not touched anymore, hence this patch now stands
alone.
drivers/mtd/nand/raw/nand_base.c | 44 ++++++++++++++++++++--------------------
include/linux/mtd/rawnand.h | 16 +++++++--------
2 files changed, 30 insertions(+), 30 deletions(-)
diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index 4fa5e20d9690..9bb76ddff4be 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -2668,8 +2668,8 @@ static bool nand_subop_instr_is_valid(const struct nand_subop *subop,
return subop && instr_idx < subop->ninstrs;
}
-static int nand_subop_get_start_off(const struct nand_subop *subop,
- unsigned int instr_idx)
+static unsigned int nand_subop_get_start_off(const struct nand_subop *subop,
+ unsigned int instr_idx)
{
if (instr_idx)
return 0;
@@ -2688,12 +2688,12 @@ static int nand_subop_get_start_off(const struct nand_subop *subop,
*
* Given an address instruction, returns the offset of the first cycle to issue.
*/
-int nand_subop_get_addr_start_off(const struct nand_subop *subop,
- unsigned int instr_idx)
+unsigned int nand_subop_get_addr_start_off(const struct nand_subop *subop,
+ unsigned int instr_idx)
{
- if (!nand_subop_instr_is_valid(subop, instr_idx) ||
- subop->instrs[instr_idx].type != NAND_OP_ADDR_INSTR)
- return -EINVAL;
+ if (WARN_ON(!nand_subop_instr_is_valid(subop, instr_idx) ||
+ subop->instrs[instr_idx].type != NAND_OP_ADDR_INSTR))
+ return 0;
return nand_subop_get_start_off(subop, instr_idx);
}
@@ -2710,14 +2710,14 @@ EXPORT_SYMBOL_GPL(nand_subop_get_addr_start_off);
*
* Given an address instruction, returns the number of address cycle to issue.
*/
-int nand_subop_get_num_addr_cyc(const struct nand_subop *subop,
- unsigned int instr_idx)
+unsigned int nand_subop_get_num_addr_cyc(const struct nand_subop *subop,
+ unsigned int instr_idx)
{
int start_off, end_off;
- if (!nand_subop_instr_is_valid(subop, instr_idx) ||
- subop->instrs[instr_idx].type != NAND_OP_ADDR_INSTR)
- return -EINVAL;
+ if (WARN_ON(!nand_subop_instr_is_valid(subop, instr_idx) ||
+ subop->instrs[instr_idx].type != NAND_OP_ADDR_INSTR))
+ return 0;
start_off = nand_subop_get_addr_start_off(subop, instr_idx);
@@ -2742,12 +2742,12 @@ EXPORT_SYMBOL_GPL(nand_subop_get_num_addr_cyc);
*
* Given a data instruction, returns the offset to start from.
*/
-int nand_subop_get_data_start_off(const struct nand_subop *subop,
- unsigned int instr_idx)
+unsigned int nand_subop_get_data_start_off(const struct nand_subop *subop,
+ unsigned int instr_idx)
{
- if (!nand_subop_instr_is_valid(subop, instr_idx) ||
- !nand_instr_is_data(&subop->instrs[instr_idx]))
- return -EINVAL;
+ if (WARN_ON(!nand_subop_instr_is_valid(subop, instr_idx) ||
+ !nand_instr_is_data(&subop->instrs[instr_idx])))
+ return 0;
return nand_subop_get_start_off(subop, instr_idx);
}
@@ -2764,14 +2764,14 @@ EXPORT_SYMBOL_GPL(nand_subop_get_data_start_off);
*
* Returns the length of the chunk of data to send/receive.
*/
-int nand_subop_get_data_len(const struct nand_subop *subop,
- unsigned int instr_idx)
+unsigned int nand_subop_get_data_len(const struct nand_subop *subop,
+ unsigned int instr_idx)
{
int start_off = 0, end_off;
- if (!nand_subop_instr_is_valid(subop, instr_idx) ||
- !nand_instr_is_data(&subop->instrs[instr_idx]))
- return -EINVAL;
+ if (WARN_ON(!nand_subop_instr_is_valid(subop, instr_idx) ||
+ !nand_instr_is_data(&subop->instrs[instr_idx])))
+ return 0;
start_off = nand_subop_get_data_start_off(subop, instr_idx);
diff --git a/include/linux/mtd/rawnand.h b/include/linux/mtd/rawnand.h
index e383c7f32574..876a9dd47e74 100644
--- a/include/linux/mtd/rawnand.h
+++ b/include/linux/mtd/rawnand.h
@@ -1007,14 +1007,14 @@ struct nand_subop {
unsigned int last_instr_end_off;
};
-int nand_subop_get_addr_start_off(const struct nand_subop *subop,
- unsigned int op_id);
-int nand_subop_get_num_addr_cyc(const struct nand_subop *subop,
- unsigned int op_id);
-int nand_subop_get_data_start_off(const struct nand_subop *subop,
- unsigned int op_id);
-int nand_subop_get_data_len(const struct nand_subop *subop,
- unsigned int op_id);
+unsigned int nand_subop_get_addr_start_off(const struct nand_subop *subop,
+ unsigned int op_id);
+unsigned int nand_subop_get_num_addr_cyc(const struct nand_subop *subop,
+ unsigned int op_id);
+unsigned int nand_subop_get_data_start_off(const struct nand_subop *subop,
+ unsigned int op_id);
+unsigned int nand_subop_get_data_len(const struct nand_subop *subop,
+ unsigned int op_id);
/**
* struct nand_op_parser_addr_constraints - Constraints for address instructions
--
2.14.1
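The convention the patch above adopts (helpers whose only failure mode is a driver bug warn loudly
and return a harmless value instead of a negative error code that an unsigned return type could not
carry anyway) can be sketched as follows, with hypothetical names rather than the rawnand API:

#include <linux/bug.h>

/* Illustrative sketch only -- hypothetical names, not the rawnand API. */
struct demo_instr {
	unsigned int offset;
};

struct demo_subop {
	unsigned int ninstrs;
	struct demo_instr instrs[8];
};

static unsigned int demo_get_start_off(const struct demo_subop *subop,
				       unsigned int instr_idx)
{
	/*
	 * An out-of-range index can only come from a buggy controller
	 * driver, so warn in the log and return a harmless 0 rather than
	 * a negative error code.
	 */
	if (WARN_ON(!subop || instr_idx >= subop->ninstrs))
		return 0;

	return subop->instrs[instr_idx].offset;
}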
Tree/Branch: v4.4.142
Git describe: v4.4.142
Commit: ecb9989751 Linux 4.4.142
Build Time: 63 min 12 sec
Passed: 10 / 10 (100.00 %)
Failed: 0 / 10 ( 0.00 %)
Errors: 0
Warnings: 31
Section Mismatches: 0
-------------------------------------------------------------------------------
defconfigs with issues (other than build errors):
19 warnings 0 mismatches : arm64-allmodconfig
17 warnings 0 mismatches : x86_64-allmodconfig
-------------------------------------------------------------------------------
Warnings Summary: 31
3 warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
2 ../drivers/media/dvb-frontends/stv090x.c:4250:1: warning: the frame size of 4832 bytes is larger than 2048 bytes [-Wframe-larger-than=]
2 ../drivers/media/dvb-frontends/stv090x.c:1211:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
2 ../drivers/media/dvb-frontends/stv090x.c:1168:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2752 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2720 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:4759:1: warning: the frame size of 2056 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2096 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 6784 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5864 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5840 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2304 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2104 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2552 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2544 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3264 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3248 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 3008 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 2992 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5296 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv0367.c:3147:1: warning: the frame size of 4144 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/stv0367.c:2490:1: warning: the frame size of 3424 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2984 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2976 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4336 bytes is larger than 2048 bytes [-Wframe-larger-than=]
1 ../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4328 bytes is larger than 2048 bytes [-Wframe-larger-than=]
===============================================================================
Detailed per-defconfig build reports below:
-------------------------------------------------------------------------------
arm64-allmodconfig : PASS, 0 errors, 19 warnings, 0 section mismatches
Warnings:
warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
warning: (IMA) selects TCG_CRB which has unmet direct dependencies (TCG_TPM && X86 && ACPI)
../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 2992 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3248 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1211:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4250:1: warning: the frame size of 4832 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1168:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2544 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5840 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 6784 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv0367.c:2490:1: warning: the frame size of 3424 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2976 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4336 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2720 bytes is larger than 2048 bytes [-Wframe-larger-than=]
-------------------------------------------------------------------------------
x86_64-allmodconfig : PASS, 0 errors, 17 warnings, 0 section mismatches
Warnings:
../drivers/media/dvb-frontends/stv090x.c:1858:1: warning: the frame size of 3008 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2141:1: warning: the frame size of 2104 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2513:1: warning: the frame size of 2304 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4565:1: warning: the frame size of 2096 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1956:1: warning: the frame size of 3264 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1599:1: warning: the frame size of 5296 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1211:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4250:1: warning: the frame size of 4832 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:4759:1: warning: the frame size of 2056 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:1168:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:2073:1: warning: the frame size of 2552 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3095:1: warning: the frame size of 5864 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv090x.c:3436:1: warning: the frame size of 5280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/stv0367.c:3147:1: warning: the frame size of 4144 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2401:1: warning: the frame size of 2984 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/media/dvb-frontends/cxd2841er.c:2282:1: warning: the frame size of 4328 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/net/ethernet/rocker/rocker.c:2172:1: warning: the frame size of 2752 bytes is larger than 2048 bytes [-Wframe-larger-than=]
-------------------------------------------------------------------------------
Passed with no errors, warnings or mismatches:
arm64-allnoconfig
arm-multi_v5_defconfig
arm-multi_v7_defconfig
x86_64-defconfig
arm-allmodconfig
arm-allnoconfig
x86_64-allnoconfig
arm64-defconfig
From: Paul Burton <paul.burton(a)mips.com>
commit 5a267832c2ec47b2dad0fdb291a96bb5b8869315 upstream.
The generic nmi_cpu_backtrace() function calls show_regs() when a struct
pt_regs is available, and dump_stack() otherwise. If we were to make use
of the generic nmi_cpu_backtrace() with MIPS' current implementation of
show_regs() this would mean that we see only register data with no
accompanying stack information, in contrast with our current
implementation which calls dump_stack() regardless of whether register
state is available.
In preparation for making use of the generic nmi_cpu_backtrace() to
implement arch_trigger_cpumask_backtrace(), have our implementation of
show_regs() call dump_stack() and drop the explicit dump_stack() call in
arch_dump_stack() which is invoked by arch_trigger_cpumask_backtrace().
This will allow the output we produce to remain the same after a later
patch switches to using nmi_cpu_backtrace(). It may mean that we produce
extra stack output in other uses of show_regs(), but this:
1) Seems harmless.
2) Is good for consistency between arch_trigger_cpumask_backtrace()
and other users of show_regs().
3) Matches the behaviour of the ARM & PowerPC architectures.
Marked for stable back to v4.9 as a prerequisite of the following patch
"MIPS: Call dump_stack() from show_regs()".
Signed-off-by: Paul Burton <paul.burton(a)mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/19596/
Cc: James Hogan <jhogan(a)kernel.org>
Cc: Ralf Baechle <ralf(a)linux-mips.org>
Cc: Huacai Chen <chenhc(a)lemote.com>
Cc: linux-mips(a)linux-mips.org
Cc: stable(a)vger.kernel.org
[ Huacai: backported to 4.4: The next patch is also backported to 4.4 ]
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
---
arch/mips/kernel/process.c | 4 ++--
arch/mips/kernel/traps.c | 1 +
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index 1ee603d..f96048a 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -637,8 +637,8 @@ static void arch_dump_stack(void *info)
if (regs)
show_regs(regs);
-
- dump_stack();
+ else
+ dump_stack();
}
void arch_trigger_all_cpu_backtrace(bool include_self)
diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
index 31ca2ed..1b90121 100644
--- a/arch/mips/kernel/traps.c
+++ b/arch/mips/kernel/traps.c
@@ -344,6 +344,7 @@ static void __show_regs(const struct pt_regs *regs)
void show_regs(struct pt_regs *regs)
{
__show_regs((struct pt_regs *)regs);
+ dump_stack();
}
void show_registers(struct pt_regs *regs)
--
2.7.0
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dea39aca1d7aef1e2b95b07edeacf04cc8863a2e Mon Sep 17 00:00:00 2001
From: Stefan Wahren <stefan.wahren(a)i2se.com>
Date: Sun, 15 Jul 2018 21:53:20 +0200
Subject: [PATCH] net: lan78xx: Fix race in tx pending skb size calculation
The skb size calculation in lan78xx_tx_bh() races with start_xmit(),
which could lead to rare kernel oopses. So protect the whole skb walk
with a spin lock. As a benefit, we can unlink the skb directly.
This patch was tested on Raspberry Pi 3B+
Link: https://github.com/raspberrypi/linux/issues/2608
Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Floris Bos <bos(a)je-eigen-domein.nl>
Signed-off-by: Stefan Wahren <stefan.wahren(a)i2se.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 2e4130746c40..ed10d49eb5e0 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -3344,6 +3344,7 @@ static void lan78xx_tx_bh(struct lan78xx_net *dev)
pkt_cnt = 0;
count = 0;
length = 0;
+ spin_lock_irqsave(&tqp->lock, flags);
for (skb = tqp->next; pkt_cnt < tqp->qlen; skb = skb->next) {
if (skb_is_gso(skb)) {
if (pkt_cnt) {
@@ -3352,7 +3353,8 @@ static void lan78xx_tx_bh(struct lan78xx_net *dev)
}
count = 1;
length = skb->len - TX_OVERHEAD;
- skb2 = skb_dequeue(tqp);
+ __skb_unlink(skb, tqp);
+ spin_unlock_irqrestore(&tqp->lock, flags);
goto gso_skb;
}
@@ -3361,6 +3363,7 @@ static void lan78xx_tx_bh(struct lan78xx_net *dev)
skb_totallen = skb->len + roundup(skb_totallen, sizeof(u32));
pkt_cnt++;
}
+ spin_unlock_irqrestore(&tqp->lock, flags);
/* copy to a single skb */
skb = alloc_skb(skb_totallen, GFP_ATOMIC);
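The locking pattern applied above (hold the queue lock across the whole walk and use
__skb_unlink() since the lock is already held, rather than skb_dequeue() which takes it again)
looks roughly like this as a standalone sketch; the function name and the length check are
illustrative only, not lan78xx code:

#include <linux/skbuff.h>

/* Illustrative sketch only -- not lan78xx code. */
static struct sk_buff *demo_pick_first_large(struct sk_buff_head *q,
					     unsigned int min_len)
{
	struct sk_buff *skb, *found = NULL;
	unsigned long flags;

	/*
	 * Writers (e.g. start_xmit) append under the same q->lock, so the
	 * walk must hold it for its whole duration to see a stable list.
	 */
	spin_lock_irqsave(&q->lock, flags);
	skb_queue_walk(q, skb) {
		if (skb->len >= min_len) {
			found = skb;
			__skb_unlink(skb, q);	/* q->lock already held */
			break;
		}
	}
	spin_unlock_irqrestore(&q->lock, flags);

	return found;
}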
I'm announcing the release of the 4.4.142 kernel.
It's not an "essential" upgrade, but a number of build problems with
perf are now resolved, and an x86 issue that some people might have hit
is now handled properly. If those were problems for you, please
upgrade.
The updated 4.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.4.y
and can be browsed at the normal kernel.org git web browser:
http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
arch/x86/kernel/cpu/common.c | 7 ++++---
scripts/Kbuild.include | 5 +++--
tools/arch/x86/include/asm/unistd_32.h | 9 +++++++++
tools/arch/x86/include/asm/unistd_64.h | 9 +++++++++
tools/build/Build.include | 5 +++--
tools/perf/config/Makefile | 1 +
tools/perf/perf-sys.h | 18 ------------------
tools/perf/util/include/asm/unistd_32.h | 1 -
tools/perf/util/include/asm/unistd_64.h | 1 -
tools/scripts/Makefile.include | 2 ++
11 files changed, 32 insertions(+), 28 deletions(-)
Andy Lutomirski (1):
x86/cpu: Probe CPUID leaf 6 even when cpuid_level == 6
Arnaldo Carvalho de Melo (1):
perf tools: Move syscall number fallbacks from perf-sys.h to tools/arch/x86/include/asm/
Greg Kroah-Hartman (1):
Linux 4.4.142
Rasmus Villemoes (1):
Kbuild: fix # escaping in .cmd files for future Make
This is the start of the stable review cycle for the 4.4.142 release.
There are 3 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri Jul 20 14:51:50 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.142-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.4.142-rc1
Arnaldo Carvalho de Melo <acme(a)redhat.com>
perf tools: Move syscall number fallbacks from perf-sys.h to tools/arch/x86/include/asm/
Andy Lutomirski <luto(a)kernel.org>
x86/cpu: Probe CPUID leaf 6 even when cpuid_level == 6
Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Kbuild: fix # escaping in .cmd files for future Make
-------------
Diffstat:
Makefile | 4 ++--
arch/x86/kernel/cpu/common.c | 7 ++++---
scripts/Kbuild.include | 5 +++--
tools/arch/x86/include/asm/unistd_32.h | 9 +++++++++
tools/arch/x86/include/asm/unistd_64.h | 9 +++++++++
tools/build/Build.include | 5 +++--
tools/perf/config/Makefile | 1 +
tools/perf/perf-sys.h | 18 ------------------
tools/perf/util/include/asm/unistd_32.h | 1 -
tools/perf/util/include/asm/unistd_64.h | 1 -
tools/scripts/Makefile.include | 2 ++
11 files changed, 33 insertions(+), 29 deletions(-)
Every time I tried to upgrade my laptop from 3.10.x to 4.x I faced an
issue where the fan would run at full speed upon resume. Bisecting
it showed me the issue was introduced in 3.17 by commit 821d6f0359b0
(ACPI / sleep: Do not save NVS for new machines to accelerate S3). This
code only affects machines built from 2012 onwards, but this Asus
1025C laptop was made in 2012 and apparently needs the NVS data to be
saved; otherwise the CPU's thermal state is not properly reported on
resume and the fan runs at full speed.
Here's a very simple way to check if such a machine is affected :
# cat /sys/class/thermal/thermal_zone0/temp
55000
( now suspend, wait one second and resume )
# cat /sys/class/thermal/thermal_zone0/temp
0
(and after ~15 seconds the fan starts to spin)
Let's apply the same quirk as commit cbc00c13 (ACPI: save NVS memory
for Lenovo G50-45) and reuse the function it provides. Note that this
commit was already backported to 4.9.x but not 4.4.x.
Cc: <linux-acpi(a)vger.kernel.org>
Cc: <stable(a)vger.kernel.org> # 3.17+: requires cbc00c13
Signed-off-by: Willy Tarreau <w(a)1wt.eu>
---
drivers/acpi/sleep.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
index 974e584..af54d7b 100644
--- a/drivers/acpi/sleep.c
+++ b/drivers/acpi/sleep.c
@@ -338,6 +338,14 @@ static const struct dmi_system_id acpisleep_dmi_table[] __initconst = {
DMI_MATCH(DMI_PRODUCT_NAME, "K54HR"),
},
},
+ {
+ .callback = init_nvs_save_s3,
+ .ident = "Asus 1025C",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+ DMI_MATCH(DMI_PRODUCT_NAME, "1025C"),
+ },
+ },
/*
* https://bugzilla.kernel.org/show_bug.cgi?id=189431
* Lenovo G50-45 is a platform later than 2012, but needs nvs memory
--
2.8.0.rc2.1.gbe9624a
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 733d3e5497070d05971352ca5087bac83c197c3d Mon Sep 17 00:00:00 2001
From: Or Gerlitz <ogerlitz(a)mellanox.com>
Date: Thu, 31 May 2018 11:32:56 +0300
Subject: [PATCH] net/mlx5e: Avoid dealing with vport representors if not being
e-switch manager
In smartnic env, the host (PF) driver might not be an e-switch
manager, hence the switchdev mode representors are running on
the embedded cpu (EC) and not at the host.
As such, we should avoid dealing with vport representors if
not being esw manager.
While here, make sure to disallow eswitch switchdev related
setups through devlink if we are not esw managers.
Fixes: cb67b832921c ('net/mlx5e: Introduce SRIOV VF representors')
Signed-off-by: Or Gerlitz <ogerlitz(a)mellanox.com>
Reviewed-by: Eli Cohen <eli(a)mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 56c1b6f5593e..dae4156a710d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2846,7 +2846,7 @@ void mlx5e_activate_priv_channels(struct mlx5e_priv *priv)
mlx5e_activate_channels(&priv->channels);
netif_tx_start_all_queues(priv->netdev);
- if (MLX5_VPORT_MANAGER(priv->mdev))
+ if (MLX5_ESWITCH_MANAGER(priv->mdev))
mlx5e_add_sqs_fwd_rules(priv);
mlx5e_wait_channels_min_rx_wqes(&priv->channels);
@@ -2857,7 +2857,7 @@ void mlx5e_deactivate_priv_channels(struct mlx5e_priv *priv)
{
mlx5e_redirect_rqts_to_drop(priv);
- if (MLX5_VPORT_MANAGER(priv->mdev))
+ if (MLX5_ESWITCH_MANAGER(priv->mdev))
mlx5e_remove_sqs_fwd_rules(priv);
/* FIXME: This is a W/A only for tx timeout watch dog false alarm when
@@ -4597,7 +4597,7 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev)
mlx5e_set_netdev_dev_addr(netdev);
#if IS_ENABLED(CONFIG_MLX5_ESWITCH)
- if (MLX5_VPORT_MANAGER(mdev))
+ if (MLX5_ESWITCH_MANAGER(mdev))
netdev->switchdev_ops = &mlx5e_switchdev_ops;
#endif
@@ -4753,7 +4753,7 @@ static void mlx5e_nic_enable(struct mlx5e_priv *priv)
mlx5e_enable_async_events(priv);
- if (MLX5_VPORT_MANAGER(priv->mdev))
+ if (MLX5_ESWITCH_MANAGER(priv->mdev))
mlx5e_register_vport_reps(priv);
if (netdev->reg_state != NETREG_REGISTERED)
@@ -4788,7 +4788,7 @@ static void mlx5e_nic_disable(struct mlx5e_priv *priv)
queue_work(priv->wq, &priv->set_rx_mode_work);
- if (MLX5_VPORT_MANAGER(priv->mdev))
+ if (MLX5_ESWITCH_MANAGER(priv->mdev))
mlx5e_unregister_vport_reps(priv);
mlx5e_disable_async_events(priv);
@@ -4972,7 +4972,7 @@ static void *mlx5e_add(struct mlx5_core_dev *mdev)
return NULL;
#ifdef CONFIG_MLX5_ESWITCH
- if (MLX5_VPORT_MANAGER(mdev)) {
+ if (MLX5_ESWITCH_MANAGER(mdev)) {
rpriv = mlx5e_alloc_nic_rep_priv(mdev);
if (!rpriv) {
mlx5_core_warn(mdev, "Failed to alloc NIC rep priv data\n");
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 60236f73373c..2b8040a3cdbd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -823,7 +823,7 @@ bool mlx5e_is_uplink_rep(struct mlx5e_priv *priv)
struct mlx5e_rep_priv *rpriv = priv->ppriv;
struct mlx5_eswitch_rep *rep;
- if (!MLX5_CAP_GEN(priv->mdev, vport_group_manager))
+ if (!MLX5_ESWITCH_MANAGER(priv->mdev))
return false;
rep = rpriv->rep;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index cecd201f0b73..91f1209886ff 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -1079,8 +1079,8 @@ static int mlx5_devlink_eswitch_check(struct devlink *devlink)
if (MLX5_CAP_GEN(dev, port_type) != MLX5_CAP_PORT_TYPE_ETH)
return -EOPNOTSUPP;
- if (!MLX5_CAP_GEN(dev, vport_group_manager))
- return -EOPNOTSUPP;
+ if(!MLX5_ESWITCH_MANAGER(dev))
+ return -EPERM;
if (dev->priv.eswitch->mode == SRIOV_NONE)
return -EOPNOTSUPP;
====================
READ ME. I have this stale email in my outgoing draft folder, and I
have no idea if I actually sent this out successfully or not.
Please double check, thanks!
====================
Please queue up the following networking bug fixes for v4.14 and v4.17
-stable, respectively.
Thank you!
From: "Gautham R. Shenoy" <ego(a)linux.vnet.ibm.com>
On 64-bit servers, SPRN_SPRG3 and its userspace read-only mirror
SPRN_USPRG3 are used as userspace VDSO write and read registers
respectively.
SPRN_SPRG3 is lost when we enter stop4 and above, and is currently not
restored. As a result, any read from SPRN_USPRG3 returns zero on an
exit from stop4 and above.
Thus in this situation, on POWER9, any call from sched_getcpu() always
returns zero, as on powerpc, we call __kernel_getcpu() which relies
upon SPRN_USPRG3 to report the CPU and NUMA node information.
Fix this by restoring SPRN_SPRG3 on wake up from a deep stop state
with the sprg_vdso value that is cached in PACA.
Fixes: e1c1cfed5432 ("powerpc/powernv: Save/Restore additional SPRs
for stop4 cpuidle")
Reported-by: Florian Weimer <fweimer(a)redhat.com>
Cc: <stable(a)vger.kernel.org> # 4.14
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Michael Neuling <mikey(a)neuling.org>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh(a)kernel.crashing.org>
Cc: Vaidyanathan Srinivasan <svaidy(a)linux.vnet.ibm.com>
Signed-off-by: Gautham R. Shenoy <ego(a)linux.vnet.ibm.com>
---
Change from v1:
Restoring the SPRG3 from paca->sprg_vdso instead of saving
it separately during stop-entry, as suggested by Mikey.
arch/powerpc/kernel/idle_book3s.S | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index d85d551..672ead8 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -144,7 +144,9 @@ power9_restore_additional_sprs:
mtspr SPRN_MMCR1, r4
ld r3, STOP_MMCR2(r13)
+ ld r4, PACA_SPRG_VDSO(r13)
mtspr SPRN_MMCR2, r3
+ mtspr SPRN_SPRG3, r4
blr
/*
--
1.8.3.1
chip->read_buf is left unassigned since commit 4da712e70294 ("mtd: nand:
fsmc: use ->exec_op()"), leading to a NULL pointer dereference when it's
called from fsmc_read_page_hwecc(). Fix that by using the appropriate
helper to read data out of the NAND.
Fixes: 4da712e70294 ("mtd: nand: fsmc: use ->exec_op()")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon(a)bootlin.com>
---
drivers/mtd/nand/raw/fsmc_nand.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mtd/nand/raw/fsmc_nand.c b/drivers/mtd/nand/raw/fsmc_nand.c
index f4a5a317d4ae..e1086a010b88 100644
--- a/drivers/mtd/nand/raw/fsmc_nand.c
+++ b/drivers/mtd/nand/raw/fsmc_nand.c
@@ -740,7 +740,7 @@ static int fsmc_read_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
for (i = 0, s = 0; s < eccsteps; s++, i += eccbytes, p += eccsize) {
nand_read_page_op(chip, page, s * eccsize, NULL, 0);
chip->ecc.hwctl(mtd, NAND_ECC_READ);
- chip->read_buf(mtd, p, eccsize);
+ nand_read_data_op(chip, p, eccsize, false);
for (j = 0; j < eccbytes;) {
struct mtd_oob_region oobregion;
--
2.14.1
For nouveau, while the GPU is guaranteed to be on when a hotplug has
been received, the same assertion does not hold if a connector probe
has been started by userspace without a sysfs event having been
received.
So ensure that any connector probing keeps the GPU alive by introducing
drm_helper_probe_single_connector_modes_with_rpm(). It's the same as
drm_helper_probe_single_connector_modes(), but it holds a runtime power
reference to the device for the duration of the connector probe.
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Reviewed-by: Karol Herbst <karolherbst(a)gmail.com>
Cc: Lukas Wunner <lukas(a)wunner.de>
Cc: stable(a)vger.kernel.org
---
Changes since v1:
- Add a generic helper to DRM to handle this
drivers/gpu/drm/drm_probe_helper.c | 31 +++++++++++++++++++++
drivers/gpu/drm/nouveau/dispnv50/disp.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_connector.c | 4 +--
include/drm/drm_crtc_helper.h | 7 +++--
4 files changed, 38 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c
index 527743394150..0a9d6748b854 100644
--- a/drivers/gpu/drm/drm_probe_helper.c
+++ b/drivers/gpu/drm/drm_probe_helper.c
@@ -31,6 +31,7 @@
#include <linux/export.h>
#include <linux/moduleparam.h>
+#include <linux/pm_runtime.h>
#include <drm/drmP.h>
#include <drm/drm_crtc.h>
@@ -541,6 +542,36 @@ int drm_helper_probe_single_connector_modes(struct drm_connector *connector,
}
EXPORT_SYMBOL(drm_helper_probe_single_connector_modes);
+/**
+ * drm_helper_probe_single_connector_modes_with_rpm - get complete set of
+ * display modes
+ * @connector: connector to probe
+ * @maxX: max width for modes
+ * @maxY: max height for modes
+ *
+ * Same as drm_helper_probe_single_connector_modes, except that it makes sure
+ * that the device is active by synchronously grabbing a runtime power
+ * reference while probing.
+ *
+ * Returns:
+ * The number of modes found on @connector.
+ */
+int drm_helper_probe_single_connector_modes_with_rpm(struct drm_connector *connector,
+ uint32_t maxX, uint32_t maxY)
+{
+ int ret;
+
+ ret = pm_runtime_get_sync(connector->dev->dev);
+ if (ret < 0 && ret != -EACCES)
+ return ret;
+
+ ret = drm_helper_probe_single_connector_modes(connector, maxX, maxY);
+
+ pm_runtime_put(connector->dev->dev);
+ return ret;
+}
+EXPORT_SYMBOL(drm_helper_probe_single_connector_modes_with_rpm);
+
/**
* drm_kms_helper_hotplug_event - fire off KMS hotplug events
* @dev: drm_device whose connector state changed
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index fa3ab618a0f9..c54767b50fd8 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -858,7 +858,7 @@ static const struct drm_connector_funcs
nv50_mstc = {
.reset = nouveau_conn_reset,
.detect = nv50_mstc_detect,
- .fill_modes = drm_helper_probe_single_connector_modes,
+ .fill_modes = drm_helper_probe_single_connector_modes_with_rpm,
.destroy = nv50_mstc_destroy,
.atomic_duplicate_state = nouveau_conn_atomic_duplicate_state,
.atomic_destroy_state = nouveau_conn_atomic_destroy_state,
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c
index 2a45b4c2ceb0..8d9070779261 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.c
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
@@ -1088,7 +1088,7 @@ nouveau_connector_funcs = {
.reset = nouveau_conn_reset,
.detect = nouveau_connector_detect,
.force = nouveau_connector_force,
- .fill_modes = drm_helper_probe_single_connector_modes,
+ .fill_modes = drm_helper_probe_single_connector_modes_with_rpm,
.set_property = nouveau_connector_set_property,
.destroy = nouveau_connector_destroy,
.atomic_duplicate_state = nouveau_conn_atomic_duplicate_state,
@@ -1103,7 +1103,7 @@ nouveau_connector_funcs_lvds = {
.reset = nouveau_conn_reset,
.detect = nouveau_connector_detect_lvds,
.force = nouveau_connector_force,
- .fill_modes = drm_helper_probe_single_connector_modes,
+ .fill_modes = drm_helper_probe_single_connector_modes_with_rpm,
.set_property = nouveau_connector_set_property,
.destroy = nouveau_connector_destroy,
.atomic_duplicate_state = nouveau_conn_atomic_duplicate_state,
diff --git a/include/drm/drm_crtc_helper.h b/include/drm/drm_crtc_helper.h
index 6914633037a5..8f3f6d6fcc8c 100644
--- a/include/drm/drm_crtc_helper.h
+++ b/include/drm/drm_crtc_helper.h
@@ -64,9 +64,10 @@ int drm_helper_crtc_mode_set_base(struct drm_crtc *crtc, int x, int y,
struct drm_framebuffer *old_fb);
/* drm_probe_helper.c */
-int drm_helper_probe_single_connector_modes(struct drm_connector
- *connector, uint32_t maxX,
- uint32_t maxY);
+int drm_helper_probe_single_connector_modes(struct drm_connector *connector,
+ uint32_t maxX, uint32_t maxY);
+int drm_helper_probe_single_connector_modes_with_rpm(struct drm_connector *connector,
+ uint32_t maxX, uint32_t maxY);
int drm_helper_probe_detect(struct drm_connector *connector,
struct drm_modeset_acquire_ctx *ctx,
bool force);
--
2.17.1
When DP MST hubs get confused, they can occasionally stop responding for
a good bit of time up until the point where the DRM driver manages to
do the right DPCD accesses to get it to start responding again. In a
worst case scenario however, this process can take upwards of 10+
seconds.
Currently we use the default output_poll_changed handler
drm_fb_helper_output_poll_changed() to handle output polling, which
doesn't happen to grab any power references on the device when polling.
If we're unlucky enough to have a hub (such as Lenovo's infamous laptop
docks for the P5x/P7x series) that's easily startled and confused, this
can lead to a pretty nasty deadlock situation that looks like this:
- Hotplug event from hub happens, we enter
drm_fb_helper_output_poll_changed() and start communicating with the
hub
- While we're in drm_fb_helper_output_poll_changed() and attempting to
communicate with the hub, we end up confusing it and cause it to stop
responding for at least 10 seconds
- After 5 seconds of being in drm_fb_helper_output_poll_changed(), the
pm core attempts to put the GPU into autosuspend, which ends up
calling drm_kms_helper_poll_disable()
- While the runtime PM core is waiting in drm_kms_helper_poll_disable()
for the output poll to finish, we end up finally detecting an MST
display
- We notice the new display and try to enable it, which triggers
an atomic commit and, in turn, a call to pm_runtime_get_sync()
- The output poll thread deadlocks with the pm core: it waits for the
pm core to finish the autosuspend request, while the pm core waits for
the output poll thread to finish
Sample:
[ 246.669625] INFO: task kworker/4:0:37 blocked for more than 120 seconds.
[ 246.673398] Not tainted 4.18.0-rc5Lyude-Test+ #2
[ 246.675271] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.676527] kworker/4:0 D 0 37 2 0x80000000
[ 246.677580] Workqueue: events output_poll_execute [drm_kms_helper]
[ 246.678704] Call Trace:
[ 246.679753] __schedule+0x322/0xaf0
[ 246.680916] schedule+0x33/0x90
[ 246.681924] schedule_preempt_disabled+0x15/0x20
[ 246.683023] __mutex_lock+0x569/0x9a0
[ 246.684035] ? kobject_uevent_env+0x117/0x7b0
[ 246.685132] ? drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[ 246.686179] mutex_lock_nested+0x1b/0x20
[ 246.687278] ? mutex_lock_nested+0x1b/0x20
[ 246.688307] drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[ 246.689420] drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper]
[ 246.690462] drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper]
[ 246.691570] output_poll_execute+0x198/0x1c0 [drm_kms_helper]
[ 246.692611] process_one_work+0x231/0x620
[ 246.693725] worker_thread+0x214/0x3a0
[ 246.694756] kthread+0x12b/0x150
[ 246.695856] ? wq_pool_ids_show+0x140/0x140
[ 246.696888] ? kthread_create_worker_on_cpu+0x70/0x70
[ 246.697998] ret_from_fork+0x3a/0x50
[ 246.699034] INFO: task kworker/0:1:60 blocked for more than 120 seconds.
[ 246.700153] Not tainted 4.18.0-rc5Lyude-Test+ #2
[ 246.701182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.702278] kworker/0:1 D 0 60 2 0x80000000
[ 246.703293] Workqueue: pm pm_runtime_work
[ 246.704393] Call Trace:
[ 246.705403] __schedule+0x322/0xaf0
[ 246.706439] ? wait_for_completion+0x104/0x190
[ 246.707393] schedule+0x33/0x90
[ 246.708375] schedule_timeout+0x3a5/0x590
[ 246.709289] ? mark_held_locks+0x58/0x80
[ 246.710208] ? _raw_spin_unlock_irq+0x2c/0x40
[ 246.711222] ? wait_for_completion+0x104/0x190
[ 246.712134] ? trace_hardirqs_on_caller+0xf4/0x190
[ 246.713094] ? wait_for_completion+0x104/0x190
[ 246.713964] wait_for_completion+0x12c/0x190
[ 246.714895] ? wake_up_q+0x80/0x80
[ 246.715727] ? get_work_pool+0x90/0x90
[ 246.716649] flush_work+0x1c9/0x280
[ 246.717483] ? flush_workqueue_prep_pwqs+0x1b0/0x1b0
[ 246.718442] __cancel_work_timer+0x146/0x1d0
[ 246.719247] cancel_delayed_work_sync+0x13/0x20
[ 246.720043] drm_kms_helper_poll_disable+0x1f/0x30 [drm_kms_helper]
[ 246.721123] nouveau_pmops_runtime_suspend+0x3d/0xb0 [nouveau]
[ 246.721897] pci_pm_runtime_suspend+0x6b/0x190
[ 246.722825] ? pci_has_legacy_pm_support+0x70/0x70
[ 246.723737] __rpm_callback+0x7a/0x1d0
[ 246.724721] ? pci_has_legacy_pm_support+0x70/0x70
[ 246.725607] rpm_callback+0x24/0x80
[ 246.726553] ? pci_has_legacy_pm_support+0x70/0x70
[ 246.727376] rpm_suspend+0x142/0x6b0
[ 246.728185] pm_runtime_work+0x97/0xc0
[ 246.728938] process_one_work+0x231/0x620
[ 246.729796] worker_thread+0x44/0x3a0
[ 246.730614] kthread+0x12b/0x150
[ 246.731395] ? wq_pool_ids_show+0x140/0x140
[ 246.732202] ? kthread_create_worker_on_cpu+0x70/0x70
[ 246.732878] ret_from_fork+0x3a/0x50
[ 246.733768] INFO: task kworker/4:2:422 blocked for more than 120 seconds.
[ 246.734587] Not tainted 4.18.0-rc5Lyude-Test+ #2
[ 246.735393] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.736113] kworker/4:2 D 0 422 2 0x80000080
[ 246.736789] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
[ 246.737665] Call Trace:
[ 246.738490] __schedule+0x322/0xaf0
[ 246.739250] schedule+0x33/0x90
[ 246.739908] rpm_resume+0x19c/0x850
[ 246.740750] ? finish_wait+0x90/0x90
[ 246.741541] __pm_runtime_resume+0x4e/0x90
[ 246.742370] nv50_disp_atomic_commit+0x31/0x210 [nouveau]
[ 246.743124] drm_atomic_commit+0x4a/0x50 [drm]
[ 246.743775] restore_fbdev_mode_atomic+0x1c8/0x240 [drm_kms_helper]
[ 246.744603] restore_fbdev_mode+0x31/0x140 [drm_kms_helper]
[ 246.745373] drm_fb_helper_restore_fbdev_mode_unlocked+0x54/0xb0 [drm_kms_helper]
[ 246.746220] drm_fb_helper_set_par+0x2d/0x50 [drm_kms_helper]
[ 246.746884] drm_fb_helper_hotplug_event.part.28+0x96/0xb0 [drm_kms_helper]
[ 246.747675] drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper]
[ 246.748544] drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper]
[ 246.749439] nv50_mstm_hotplug+0x15/0x20 [nouveau]
[ 246.750111] drm_dp_send_link_address+0x177/0x1c0 [drm_kms_helper]
[ 246.750764] drm_dp_check_and_send_link_address+0xa8/0xd0 [drm_kms_helper]
[ 246.751602] drm_dp_mst_link_probe_work+0x51/0x90 [drm_kms_helper]
[ 246.752314] process_one_work+0x231/0x620
[ 246.752979] worker_thread+0x44/0x3a0
[ 246.753838] kthread+0x12b/0x150
[ 246.754619] ? wq_pool_ids_show+0x140/0x140
[ 246.755386] ? kthread_create_worker_on_cpu+0x70/0x70
[ 246.756162] ret_from_fork+0x3a/0x50
[ 246.756847]
Showing all locks held in the system:
[ 246.758261] 3 locks held by kworker/4:0/37:
[ 246.759016] #0: 00000000f8df4d2d ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.759856] #1: 00000000e6065461 ((work_completion)(&(&dev->mode_config.output_poll_work)->work)){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.760670] #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[ 246.761516] 2 locks held by kworker/0:1/60:
[ 246.762274] #0: 00000000fff6be0f ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.762982] #1: 000000005ab44fb4 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.763890] 1 lock held by khungtaskd/64:
[ 246.764664] #0: 000000008cb8b5c3 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185
[ 246.765588] 5 locks held by kworker/4:2/422:
[ 246.766440] #0: 00000000232f0959 ((wq_completion)"events_long"){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.767390] #1: 00000000bb59b134 ((work_completion)(&mgr->work)){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.768154] #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_restore_fbdev_mode_unlocked+0x4c/0xb0 [drm_kms_helper]
[ 246.768966] #3: 000000004c8f0b6b (crtc_ww_class_acquire){+.+.}, at: restore_fbdev_mode_atomic+0x4b/0x240 [drm_kms_helper]
[ 246.769921] #4: 000000004c34a296 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_backoff+0x8a/0x1b0 [drm]
[ 246.770839] 1 lock held by dmesg/1038:
[ 246.771739] 2 locks held by zsh/1172:
[ 246.772650] #0: 00000000836d0438 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
[ 246.773680] #1: 000000001f4f4d48 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870
[ 246.775522] =============================================
So, to fix this (and any other possible deadlock issues like this that
could occur in the output_poll_changed path), we make sure that
nouveau's output_poll_changed functions grab a runtime power ref before
sending any hotplug events, and hold it until we're finished. We
introduce this as a generic DRM helper which other drivers may reuse.
This fixes the deadlock in fbcon with nouveau on my P50, and should
fix it for everyone else as well!
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Reviewed-by: Karol Herbst <karolherbst(a)gmail.com>
Cc: Lukas Wunner <lukas(a)wunner.de>
Cc: stable(a)vger.kernel.org
---
Changes since v1:
- Add a generic helper for DRM to handle this
drivers/gpu/drm/drm_fb_helper.c | 23 +++++++++++++++++++++++
drivers/gpu/drm/nouveau/dispnv50/disp.c | 2 +-
include/drm/drm_fb_helper.h | 5 +++++
3 files changed, 29 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 2ee1eaa66188..1ab2f3646526 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -34,6 +34,7 @@
#include <linux/sysrq.h>
#include <linux/slab.h>
#include <linux/module.h>
+#include <linux/pm_runtime.h>
#include <drm/drmP.h>
#include <drm/drm_crtc.h>
#include <drm/drm_fb_helper.h>
@@ -2928,6 +2929,28 @@ void drm_fb_helper_output_poll_changed(struct drm_device *dev)
}
EXPORT_SYMBOL(drm_fb_helper_output_poll_changed);
+/**
+ * drm_fb_helper_output_poll_changed_with_rpm - DRM mode
+ * config \.output_poll_changed
+ * helper for fbdev emulation
+ * @dev: DRM device
+ *
+ * Same as drm_fb_helper_output_poll_changed, except that it makes sure that
+ * the device is active by synchronously grabbing a runtime power reference
+ * while probing.
+ */
+void drm_fb_helper_output_poll_changed_with_rpm(struct drm_device *dev)
+{
+ int ret;
+
+ ret = pm_runtime_get_sync(dev->dev);
+ if (WARN_ON(ret < 0 && ret != -EACCES))
+ return;
+ drm_fb_helper_hotplug_event(dev->fb_helper);
+ pm_runtime_put(dev->dev);
+}
+EXPORT_SYMBOL(drm_fb_helper_output_poll_changed_with_rpm);
+
/* The Kconfig DRM_KMS_HELPER selects FRAMEBUFFER_CONSOLE (if !EXPERT)
* but the module doesn't depend on any fb console symbols. At least
* attempt to load fbcon to avoid leaving the system without a usable console.
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index eb3e41a78806..fa3ab618a0f9 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -2015,7 +2015,7 @@ nv50_disp_atomic_state_alloc(struct drm_device *dev)
static const struct drm_mode_config_funcs
nv50_disp_func = {
.fb_create = nouveau_user_framebuffer_create,
- .output_poll_changed = drm_fb_helper_output_poll_changed,
+ .output_poll_changed = drm_fb_helper_output_poll_changed_with_rpm,
.atomic_check = nv50_disp_atomic_check,
.atomic_commit = nv50_disp_atomic_commit,
.atomic_state_alloc = nv50_disp_atomic_state_alloc,
diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h
index b069433e7fc1..ca809bfbaebb 100644
--- a/include/drm/drm_fb_helper.h
+++ b/include/drm/drm_fb_helper.h
@@ -330,6 +330,7 @@ void drm_fb_helper_fbdev_teardown(struct drm_device *dev);
void drm_fb_helper_lastclose(struct drm_device *dev);
void drm_fb_helper_output_poll_changed(struct drm_device *dev);
+void drm_fb_helper_output_poll_changed_with_rpm(struct drm_device *dev);
#else
static inline void drm_fb_helper_prepare(struct drm_device *dev,
struct drm_fb_helper *helper,
@@ -564,6 +565,10 @@ static inline void drm_fb_helper_output_poll_changed(struct drm_device *dev)
{
}
+static inline void drm_fb_helper_output_poll_changed_with_rpm(struct drm_device *dev)
+{
+}
+
#endif
static inline int
--
2.17.1
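For drivers other than nouveau, adopting the new helper only means pointing the
.output_poll_changed hook at it. Below is a minimal sketch, assuming a
hypothetical atomic driver built on the stock GEM framebuffer and atomic helpers
(the table name and the other callbacks are illustrative, not part of this
patch). Note that the helper tolerates -EACCES from pm_runtime_get_sync(), so it
still behaves correctly on devices that have runtime PM disabled.

#include <drm/drm_atomic_helper.h>
#include <drm/drm_fb_helper.h>
#include <drm/drm_gem_framebuffer_helper.h>

/*
 * Hypothetical mode-config table for a runtime-PM-aware KMS driver.
 * Routing .output_poll_changed through the new helper brackets the
 * fbdev hotplug event with pm_runtime_get_sync()/pm_runtime_put(),
 * so the output probe never races a runtime suspend in progress.
 */
static const struct drm_mode_config_funcs example_mode_config_funcs = {
	.fb_create           = drm_gem_fb_create,
	.output_poll_changed = drm_fb_helper_output_poll_changed_with_rpm,
	.atomic_check        = drm_atomic_helper_check,
	.atomic_commit       = drm_atomic_helper_commit,
};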