This is a note to let you know that I've just added the patch titled
mm/zswap: use workqueue to destroy pool
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-zswap-use-workqueue-to-destroy-pool.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From 200867af4dedfe7cb707f96773684de1d1fd21e6 Mon Sep 17 00:00:00 2001
From: Dan Streetman <ddstreet@ieee.org>
Date: Fri, 20 May 2016 16:59:54 -0700
Subject: mm/zswap: use workqueue to destroy pool
From: Dan Streetman <ddstreet@ieee.org>
commit 200867af4dedfe7cb707f96773684de1d1fd21e6 upstream.
Add a work_struct to struct zswap_pool, and change __zswap_pool_empty to
use the workqueue instead of using call_rcu().
When zswap destroys a pool no longer in use, it uses call_rcu() to
perform the destruction/freeing. Since that executes in softirq
context, it must not sleep. However, actually destroying the pool
involves freeing the per-cpu compressors (which requires locking the
cpu_add_remove_lock mutex) and freeing the zpool, for which the
implementation may sleep (e.g. zsmalloc calls kmem_cache_destroy, which
locks the slab_mutex). So if either mutex is currently taken, or any
other part of the compressor or zpool implementation sleeps, it will
result in a BUG().
It's not easy to reproduce this when changing zswap's params normally.
In testing with a loaded system, this does not fail:
$ cd /sys/module/zswap/parameters
$ echo lz4 > compressor ; echo zsmalloc > zpool
nor does this:
$ while true ; do
> echo lzo > compressor ; echo zbud > zpool
> sleep 1
> echo lz4 > compressor ; echo zsmalloc > zpool
> sleep 1
> done
although it's still possible either of those might fail, depending on
whether anything else besides zswap has locked the mutexes.
However, changing a parameter with no delay immediately causes the
"scheduling while atomic" BUG:
$ while true ; do
> echo lzo > compressor ; echo lz4 > compressor
> done
This is essentially the same as Yu Zhao's proposed patch to zsmalloc,
but moved to zswap, to cover compressor and zpool freeing.
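For reference, a minimal sketch of the pattern the patch adopts - defer the
sleeping teardown to process context via a work item, and wait for RCU
readers inside the worker instead of using call_rcu() (hypothetical my_pool
type and function names, not the actual zswap code):

  #include <linux/kernel.h>
  #include <linux/workqueue.h>
  #include <linux/rcupdate.h>
  #include <linux/slab.h>

  struct my_pool {
          struct work_struct release_work;
          /* per-cpu compressors, zpool, ... */
  };

  static void my_pool_release_workfn(struct work_struct *work)
  {
          struct my_pool *pool = container_of(work, struct my_pool,
                                              release_work);

          /* wait for RCU readers that may still see the pool on the list */
          synchronize_rcu();

          /* process context: teardown that sleeps is fine here */
          kfree(pool);
  }

  static void my_pool_put_last(struct my_pool *pool)
  {
          /* instead of call_rcu(&pool->rcu_head, ...) from softirq */
          INIT_WORK(&pool->release_work, my_pool_release_workfn);
          schedule_work(&pool->release_work);
  }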
Fixes: f1c54846ee45 ("zswap: dynamic pool creation")
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Reported-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Dan Streetman <dan.streetman@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/zswap.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -123,7 +123,7 @@ struct zswap_pool {
struct crypto_comp * __percpu *tfm;
struct kref kref;
struct list_head list;
- struct rcu_head rcu_head;
+ struct work_struct work;
struct notifier_block notifier;
char tfm_name[CRYPTO_MAX_ALG_NAME];
};
@@ -667,9 +667,11 @@ static int __must_check zswap_pool_get(s
return kref_get_unless_zero(&pool->kref);
}
-static void __zswap_pool_release(struct rcu_head *head)
+static void __zswap_pool_release(struct work_struct *work)
{
- struct zswap_pool *pool = container_of(head, typeof(*pool), rcu_head);
+ struct zswap_pool *pool = container_of(work, typeof(*pool), work);
+
+ synchronize_rcu();
/* nobody should have been able to get a kref... */
WARN_ON(kref_get_unless_zero(&pool->kref));
@@ -689,7 +691,9 @@ static void __zswap_pool_empty(struct kr
WARN_ON(pool == zswap_pool_current());
list_del_rcu(&pool->list);
- call_rcu(&pool->rcu_head, __zswap_pool_release);
+
+ INIT_WORK(&pool->work, __zswap_pool_release);
+ schedule_work(&pool->work);
spin_unlock(&zswap_pools_lock);
}
Patches currently in stable-queue which might be from ddstreet@ieee.org are
queue-4.4/mm-zswap-use-workqueue-to-destroy-pool.patch
queue-4.4/zswap-don-t-param_set_charp-while-holding-spinlock.patch
This is a note to let you know that I've just added the patch titled
zswap: don't param_set_charp while holding spinlock
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
zswap-don-t-param_set_charp-while-holding-spinlock.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From fd5bb66cd934987e49557455b6497fc006521940 Mon Sep 17 00:00:00 2001
From: Dan Streetman <ddstreet@ieee.org>
Date: Mon, 27 Feb 2017 14:26:53 -0800
Subject: zswap: don't param_set_charp while holding spinlock
From: Dan Streetman <ddstreet@ieee.org>
commit fd5bb66cd934987e49557455b6497fc006521940 upstream.
Change the zpool/compressor param callback function to release the
zswap_pools_lock spinlock before calling param_set_charp, since that
function may sleep when it calls kmalloc with GFP_KERNEL.
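A minimal sketch of the resulting drop-the-lock-around-the-sleeping-call
shape (hypothetical my_lock and callback names, not the exact zswap code):

  #include <linux/moduleparam.h>
  #include <linux/spinlock.h>

  static DEFINE_SPINLOCK(my_lock);

  static int my_param_set(const char *val, const struct kernel_param *kp)
  {
          int ret;

          spin_lock(&my_lock);
          /* lookups/unlinking that need the lock */
          spin_unlock(&my_lock);

          /* param_set_charp() kmallocs with GFP_KERNEL and may sleep,
           * so it must run with the spinlock dropped */
          ret = param_set_charp(val, kp);

          spin_lock(&my_lock);
          /* re-take the lock and publish the result */
          spin_unlock(&my_lock);

          return ret;
  }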
While this problem has existed for a while, I wasn't able to trigger it
using a tight loop changing either or both the zpool and compressor params;
I think it's very unlikely to be an issue on the stable kernels, especially
since most zswap users will change the compressor and/or zpool from sysfs
only once per boot - or not at all, if they pass the params on the kernel
boot command line.
Fixes: c99b42c3529e ("zswap: use charp for zswap param strings")
Link: http://lkml.kernel.org/r/20170126155821.4545-1-ddstreet@ieee.org
Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
Reported-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/zswap.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -752,18 +752,22 @@ static int __zswap_param_set(const char
pool = zswap_pool_find_get(type, compressor);
if (pool) {
zswap_pool_debug("using existing", pool);
+ WARN_ON(pool == zswap_pool_current());
list_del_rcu(&pool->list);
- } else {
- spin_unlock(&zswap_pools_lock);
- pool = zswap_pool_create(type, compressor);
- spin_lock(&zswap_pools_lock);
}
+ spin_unlock(&zswap_pools_lock);
+
+ if (!pool)
+ pool = zswap_pool_create(type, compressor);
+
if (pool)
ret = param_set_charp(s, kp);
else
ret = -EINVAL;
+ spin_lock(&zswap_pools_lock);
+
if (!ret) {
put_pool = zswap_pool_current();
list_add_rcu(&pool->list, &zswap_pools);
Patches currently in stable-queue which might be from ddstreet@ieee.org are
queue-4.4/mm-zswap-use-workqueue-to-destroy-pool.patch
queue-4.4/zswap-don-t-param_set_charp-while-holding-spinlock.patch
This is a note to let you know that I've just added the patch titled
mm/page-writeback: fix dirty_ratelimit calculation
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-page-writeback-fix-dirty_ratelimit-calculation.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From d59b1087a98e402ed9a7cc577f4da435f9a555f5 Mon Sep 17 00:00:00 2001
From: Andrey Ryabinin <aryabinin@virtuozzo.com>
Date: Tue, 15 Mar 2016 14:55:27 -0700
Subject: mm/page-writeback: fix dirty_ratelimit calculation
From: Andrey Ryabinin <aryabinin@virtuozzo.com>
commit d59b1087a98e402ed9a7cc577f4da435f9a555f5 upstream.
The calculation of dirty_ratelimit is sometimes incorrect. E.g. the
initial values dirty_ratelimit == INIT_BW and step == 0 lead to the
following result:
UBSAN: Undefined behaviour in ../mm/page-writeback.c:1286:7
shift exponent 25600 is too large for 64-bit type 'long unsigned int'
The fix is straightforward - make step 0 if the shift exponent is too
big.
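The underlying rule: shifting an unsigned long by BITS_PER_LONG or more is
undefined behaviour, so the shift count has to be bounded first. A sketch of
the guarded form (same logic as the hunk below, shown standalone):

  #include <linux/kernel.h>
  #include <linux/bitops.h>

  static unsigned long tracked_step(unsigned long step,
                                    unsigned long dirty_ratelimit)
  {
          unsigned long shift = dirty_ratelimit / (2 * step + 1);

          /* a shift of BITS_PER_LONG or more is undefined behaviour */
          if (shift >= BITS_PER_LONG)
                  return 0;

          return DIV_ROUND_UP(step >> shift, 8);
  }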
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/page-writeback.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1162,6 +1162,7 @@ static void wb_update_dirty_ratelimit(st
unsigned long balanced_dirty_ratelimit;
unsigned long step;
unsigned long x;
+ unsigned long shift;
/*
* The dirty rate will match the writeout rate in long term, except
@@ -1286,11 +1287,11 @@ static void wb_update_dirty_ratelimit(st
* rate itself is constantly fluctuating. So decrease the track speed
* when it gets close to the target. Helps eliminate pointless tremors.
*/
- step >>= dirty_ratelimit / (2 * step + 1);
- /*
- * Limit the tracking speed to avoid overshooting.
- */
- step = (step + 7) / 8;
+ shift = dirty_ratelimit / (2 * step + 1);
+ if (shift < BITS_PER_LONG)
+ step = DIV_ROUND_UP(step >> shift, 8);
+ else
+ step = 0;
if (dirty_ratelimit < balanced_dirty_ratelimit)
dirty_ratelimit += step;
Patches currently in stable-queue which might be from aryabinin@virtuozzo.com are
queue-4.4/net-mac80211-debugfs.c-prevent-build-failure-with-config_ubsan-y.patch
queue-4.4/mm-page-writeback-fix-dirty_ratelimit-calculation.patch
This is a note to let you know that I've just added the patch titled
mm/compaction: pass only pageblock aligned range to pageblock_pfn_to_page
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-compaction-pass-only-pageblock-aligned-range-to-pageblock_pfn_to_page.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From e1409c325fdc1fef7b3d8025c51892355f065d15 Mon Sep 17 00:00:00 2001
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Date: Tue, 15 Mar 2016 14:57:48 -0700
Subject: mm/compaction: pass only pageblock aligned range to pageblock_pfn_to_page
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
commit e1409c325fdc1fef7b3d8025c51892355f065d15 upstream.
pageblock_pfn_to_page() is used to check that there is a valid pfn and that
all pages in the pageblock are in a single zone. If there is a hole in the
pageblock, passing an arbitrary position to pageblock_pfn_to_page() could
cause the whole pageblock scan to be skipped, instead of just the hole
page. For deterministic behaviour, it's better to always pass a
pageblock-aligned range to pageblock_pfn_to_page(). It will also help
further optimization of pageblock_pfn_to_page() in the following patch.
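The alignment itself is just a round-down to the pageblock boundary, clamped
so it never goes below the first pfn of the zone. Roughly (a sketch using
the kernel's pageblock_nr_pages, not the exact compaction code):

  #include <linux/pageblock-flags.h>

  /* first pfn of the pageblock containing @pfn, but never below the zone start */
  static unsigned long pageblock_aligned_start(unsigned long pfn,
                                               unsigned long zone_start_pfn)
  {
          unsigned long block_start_pfn = pfn & ~(pageblock_nr_pages - 1);

          if (block_start_pfn < zone_start_pfn)
                  block_start_pfn = zone_start_pfn;

          return block_start_pfn;
  }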
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Aaron Lu <aaron.lu@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/compaction.c | 41 ++++++++++++++++++++++++++++++-----------
1 file changed, 30 insertions(+), 11 deletions(-)
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -553,13 +553,17 @@ unsigned long
isolate_freepages_range(struct compact_control *cc,
unsigned long start_pfn, unsigned long end_pfn)
{
- unsigned long isolated, pfn, block_end_pfn;
+ unsigned long isolated, pfn, block_start_pfn, block_end_pfn;
LIST_HEAD(freelist);
pfn = start_pfn;
+ block_start_pfn = pfn & ~(pageblock_nr_pages - 1);
+ if (block_start_pfn < cc->zone->zone_start_pfn)
+ block_start_pfn = cc->zone->zone_start_pfn;
block_end_pfn = ALIGN(pfn + 1, pageblock_nr_pages);
for (; pfn < end_pfn; pfn += isolated,
+ block_start_pfn = block_end_pfn,
block_end_pfn += pageblock_nr_pages) {
/* Protect pfn from changing by isolate_freepages_block */
unsigned long isolate_start_pfn = pfn;
@@ -572,11 +576,13 @@ isolate_freepages_range(struct compact_c
* scanning range to right one.
*/
if (pfn >= block_end_pfn) {
+ block_start_pfn = pfn & ~(pageblock_nr_pages - 1);
block_end_pfn = ALIGN(pfn + 1, pageblock_nr_pages);
block_end_pfn = min(block_end_pfn, end_pfn);
}
- if (!pageblock_pfn_to_page(pfn, block_end_pfn, cc->zone))
+ if (!pageblock_pfn_to_page(block_start_pfn,
+ block_end_pfn, cc->zone))
break;
isolated = isolate_freepages_block(cc, &isolate_start_pfn,
@@ -862,18 +868,23 @@ unsigned long
isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn,
unsigned long end_pfn)
{
- unsigned long pfn, block_end_pfn;
+ unsigned long pfn, block_start_pfn, block_end_pfn;
/* Scan block by block. First and last block may be incomplete */
pfn = start_pfn;
+ block_start_pfn = pfn & ~(pageblock_nr_pages - 1);
+ if (block_start_pfn < cc->zone->zone_start_pfn)
+ block_start_pfn = cc->zone->zone_start_pfn;
block_end_pfn = ALIGN(pfn + 1, pageblock_nr_pages);
for (; pfn < end_pfn; pfn = block_end_pfn,
+ block_start_pfn = block_end_pfn,
block_end_pfn += pageblock_nr_pages) {
block_end_pfn = min(block_end_pfn, end_pfn);
- if (!pageblock_pfn_to_page(pfn, block_end_pfn, cc->zone))
+ if (!pageblock_pfn_to_page(block_start_pfn,
+ block_end_pfn, cc->zone))
continue;
pfn = isolate_migratepages_block(cc, pfn, block_end_pfn,
@@ -1091,7 +1102,9 @@ int sysctl_compact_unevictable_allowed _
static isolate_migrate_t isolate_migratepages(struct zone *zone,
struct compact_control *cc)
{
- unsigned long low_pfn, end_pfn;
+ unsigned long block_start_pfn;
+ unsigned long block_end_pfn;
+ unsigned long low_pfn;
unsigned long isolate_start_pfn;
struct page *page;
const isolate_mode_t isolate_mode =
@@ -1103,16 +1116,21 @@ static isolate_migrate_t isolate_migrate
* initialized by compact_zone()
*/
low_pfn = cc->migrate_pfn;
+ block_start_pfn = cc->migrate_pfn & ~(pageblock_nr_pages - 1);
+ if (block_start_pfn < zone->zone_start_pfn)
+ block_start_pfn = zone->zone_start_pfn;
/* Only scan within a pageblock boundary */
- end_pfn = ALIGN(low_pfn + 1, pageblock_nr_pages);
+ block_end_pfn = ALIGN(low_pfn + 1, pageblock_nr_pages);
/*
* Iterate over whole pageblocks until we find the first suitable.
* Do not cross the free scanner.
*/
- for (; end_pfn <= cc->free_pfn;
- low_pfn = end_pfn, end_pfn += pageblock_nr_pages) {
+ for (; block_end_pfn <= cc->free_pfn;
+ low_pfn = block_end_pfn,
+ block_start_pfn = block_end_pfn,
+ block_end_pfn += pageblock_nr_pages) {
/*
* This can potentially iterate a massively long zone with
@@ -1123,7 +1141,8 @@ static isolate_migrate_t isolate_migrate
&& compact_should_abort(cc))
break;
- page = pageblock_pfn_to_page(low_pfn, end_pfn, zone);
+ page = pageblock_pfn_to_page(block_start_pfn, block_end_pfn,
+ zone);
if (!page)
continue;
@@ -1142,8 +1161,8 @@ static isolate_migrate_t isolate_migrate
/* Perform the isolation */
isolate_start_pfn = low_pfn;
- low_pfn = isolate_migratepages_block(cc, low_pfn, end_pfn,
- isolate_mode);
+ low_pfn = isolate_migratepages_block(cc, low_pfn,
+ block_end_pfn, isolate_mode);
if (!low_pfn || cc->contended) {
acct_isolated(zone, cc);
Patches currently in stable-queue which might be from iamjoonsoo.kim@lge.com are
queue-4.4/mm-compaction-pass-only-pageblock-aligned-range-to-pageblock_pfn_to_page.patch
queue-4.4/mm-compaction-fix-invalid-free_pfn-and-compact_cached_free_pfn.patch
This is a note to let you know that I've just added the patch titled
locks: don't check for race with close when setting OFD lock
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
locks-don-t-check-for-race-with-close-when-setting-ofd-lock.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From 0752ba807b04ccd69cb4bc8bbf829a80ee208a3c Mon Sep 17 00:00:00 2001
From: Jeff Layton <jeff.layton@primarydata.com>
Date: Fri, 8 Jan 2016 07:30:43 -0500
Subject: locks: don't check for race with close when setting OFD lock
From: Jeff Layton <jeff.layton@primarydata.com>
commit 0752ba807b04ccd69cb4bc8bbf829a80ee208a3c upstream.
We don't clean out OFD locks on close(), so there's no need to check
for a race with them here. They'll get cleaned out at the same time
that flock locks are.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
fs/locks.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2220,10 +2220,12 @@ int fcntl_setlk(unsigned int fd, struct
error = do_lock_file_wait(filp, cmd, file_lock);
/*
- * Attempt to detect a close/fcntl race and recover by
- * releasing the lock that was just acquired.
+ * Attempt to detect a close/fcntl race and recover by releasing the
+ * lock that was just acquired. There is no need to do that when we're
+ * unlocking though, or for OFD locks.
*/
- if (!error && file_lock->fl_type != F_UNLCK) {
+ if (!error && file_lock->fl_type != F_UNLCK &&
+ !(file_lock->fl_flags & FL_OFDLCK)) {
/*
* We need that spin_lock here - it prevents reordering between
* update of i_flctx->flc_posix and check for it done in
@@ -2362,10 +2364,12 @@ int fcntl_setlk64(unsigned int fd, struc
error = do_lock_file_wait(filp, cmd, file_lock);
/*
- * Attempt to detect a close/fcntl race and recover by
- * releasing the lock that was just acquired.
+ * Attempt to detect a close/fcntl race and recover by releasing the
+ * lock that was just acquired. There is no need to do that when we're
+ * unlocking though, or for OFD locks.
*/
- if (!error && file_lock->fl_type != F_UNLCK) {
+ if (!error && file_lock->fl_type != F_UNLCK &&
+ !(file_lock->fl_flags & FL_OFDLCK)) {
/*
* We need that spin_lock here - it prevents reordering between
* update of i_flctx->flc_posix and check for it done in
Patches currently in stable-queue which might be from jeff.layton@primarydata.com are
queue-4.4/locks-don-t-check-for-race-with-close-when-setting-ofd-lock.patch
This is a note to let you know that I've just added the patch titled
mm/compaction: fix invalid free_pfn and compact_cached_free_pfn
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-compaction-fix-invalid-free_pfn-and-compact_cached_free_pfn.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From 623446e4dc45b37740268165107cc63abb3022f0 Mon Sep 17 00:00:00 2001
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Date: Tue, 15 Mar 2016 14:57:45 -0700
Subject: mm/compaction: fix invalid free_pfn and compact_cached_free_pfn
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
commit 623446e4dc45b37740268165107cc63abb3022f0 upstream.
free_pfn and compact_cached_free_pfn remember the restart position of the
freepage scanner. When they are reset or invalid, we set them to
zone_end_pfn because the freepage scanner works in the reverse direction.
But, because the zone range is defined as [zone_start_pfn, zone_end_pfn),
zone_end_pfn is not valid to access. Therefore, we should not store it in
free_pfn and compact_cached_free_pfn; instead, we need to store
zone_end_pfn - 1 in them. There is one more thing to consider. The
freepage scanner scans in reverse, one pageblock at a time. If free_pfn
and compact_cached_free_pfn point to the middle of a pageblock, the scanner
regards that situation as having already scanned the front part of the
pageblock, so we lose the opportunity to scan there. To fix this up, this
patch uses round_down() to guarantee that the reset position is pageblock
aligned.
Note that, thanks to the current pageblock_pfn_to_page() implementation,
no actual access to zone_end_pfn happens yet. But the following patch will
change pageblock_pfn_to_page(), so this patch is needed from now on.
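In other words, since the zone range is half-open, the highest valid reset
value is the start of the zone's last pageblock. A one-line sketch of the
reset calculation (using the kernel's round_down() and zone_end_pfn()
helpers):

  #include <linux/kernel.h>
  #include <linux/mmzone.h>

  /* last valid, pageblock-aligned pfn of the zone */
  static unsigned long free_scanner_reset_pfn(struct zone *zone)
  {
          return round_down(zone_end_pfn(zone) - 1, pageblock_nr_pages);
  }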
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Aaron Lu <aaron.lu@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/compaction.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -200,7 +200,8 @@ static void reset_cached_positions(struc
{
zone->compact_cached_migrate_pfn[0] = zone->zone_start_pfn;
zone->compact_cached_migrate_pfn[1] = zone->zone_start_pfn;
- zone->compact_cached_free_pfn = zone_end_pfn(zone);
+ zone->compact_cached_free_pfn =
+ round_down(zone_end_pfn(zone) - 1, pageblock_nr_pages);
}
/*
@@ -1358,11 +1359,11 @@ static int compact_zone(struct zone *zon
*/
cc->migrate_pfn = zone->compact_cached_migrate_pfn[sync];
cc->free_pfn = zone->compact_cached_free_pfn;
- if (cc->free_pfn < start_pfn || cc->free_pfn > end_pfn) {
- cc->free_pfn = end_pfn & ~(pageblock_nr_pages-1);
+ if (cc->free_pfn < start_pfn || cc->free_pfn >= end_pfn) {
+ cc->free_pfn = round_down(end_pfn - 1, pageblock_nr_pages);
zone->compact_cached_free_pfn = cc->free_pfn;
}
- if (cc->migrate_pfn < start_pfn || cc->migrate_pfn > end_pfn) {
+ if (cc->migrate_pfn < start_pfn || cc->migrate_pfn >= end_pfn) {
cc->migrate_pfn = start_pfn;
zone->compact_cached_migrate_pfn[0] = cc->migrate_pfn;
zone->compact_cached_migrate_pfn[1] = cc->migrate_pfn;
Patches currently in stable-queue which might be from iamjoonsoo.kim@lge.com are
queue-4.4/mm-compaction-pass-only-pageblock-aligned-range-to-pageblock_pfn_to_page.patch
queue-4.4/mm-compaction-fix-invalid-free_pfn-and-compact_cached_free_pfn.patch
This is a note to let you know that I've just added the patch titled
locking/mutex: Allow next waiter lockless wakeup
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
locking-mutex-allow-next-waiter-lockless-wakeup.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From 1329ce6fbbe4536592dfcfc8d64d61bfeb598fe6 Mon Sep 17 00:00:00 2001
From: Davidlohr Bueso <dave@stgolabs.net>
Date: Sun, 24 Jan 2016 18:23:43 -0800
Subject: locking/mutex: Allow next waiter lockless wakeup
From: Davidlohr Bueso <dave@stgolabs.net>
commit 1329ce6fbbe4536592dfcfc8d64d61bfeb598fe6 upstream.
Make use of wake-queues and enable the wakeup to occur after releasing the
wait_lock. This is similar to what we do with the rtmutex top waiter,
slightly shortening the critical region and allowing other waiters to
acquire the wait_lock sooner. In low-contention cases it can also help
the recently woken waiter find the wait_lock available (fastpath) when it
continues execution.
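The wake_q pattern in a nutshell (a generic sketch against the 4.4-era API,
not the mutex code itself): queue the task to wake while still holding the
lock, drop the lock, then perform the wakeup.

  #include <linux/sched.h>
  #include <linux/spinlock.h>

  static void release_and_wake(spinlock_t *lock, struct task_struct *waiter)
  {
          WAKE_Q(wake_q);                 /* on-stack wake queue */

          spin_lock(lock);
          /* ... list manipulation that picks/removes @waiter ... */
          wake_q_add(&wake_q, waiter);    /* only queues, no wakeup yet */
          spin_unlock(lock);

          wake_up_q(&wake_q);             /* wake outside the critical section */
  }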
Reviewed-by: Waiman Long <Waiman.Long@hpe.com>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ding Tianhong <dingtianhong@huawei.com>
Cc: Jason Low <jason.low2@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul E. McKenney <paulmck@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Waiman Long <waiman.long@hpe.com>
Cc: Will Deacon <Will.Deacon@arm.com>
Link: http://lkml.kernel.org/r/20160125022343.GA3322@linux-uzut.site
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/locking/mutex.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -719,6 +719,7 @@ static inline void
__mutex_unlock_common_slowpath(struct mutex *lock, int nested)
{
unsigned long flags;
+ WAKE_Q(wake_q);
/*
* As a performance measurement, release the lock before doing other
@@ -746,11 +747,11 @@ __mutex_unlock_common_slowpath(struct mu
struct mutex_waiter, list);
debug_mutex_wake_waiter(lock, waiter);
-
- wake_up_process(waiter->task);
+ wake_q_add(&wake_q, waiter->task);
}
spin_unlock_mutex(&lock->wait_lock, flags);
+ wake_up_q(&wake_q);
}
/*
Patches currently in stable-queue which might be from dave@stgolabs.net are
queue-4.4/locking-mutex-allow-next-waiter-lockless-wakeup.patch
queue-4.4/futex-replace-barrier-in-unqueue_me-with-read_once.patch
This is a note to let you know that I've just added the patch titled
futex: Replace barrier() in unqueue_me() with READ_ONCE()
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
futex-replace-barrier-in-unqueue_me-with-read_once.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
From 29b75eb2d56a714190a93d7be4525e617591077a Mon Sep 17 00:00:00 2001
From: Jianyu Zhan <nasa4836@gmail.com>
Date: Mon, 7 Mar 2016 09:32:24 +0800
Subject: futex: Replace barrier() in unqueue_me() with READ_ONCE()
From: Jianyu Zhan <nasa4836@gmail.com>
commit 29b75eb2d56a714190a93d7be4525e617591077a upstream.
Commit e91467ecd1ef ("bug in futex unqueue_me") introduced a barrier() in
unqueue_me() to prevent the compiler from rereading the lock pointer which
might change after a check for NULL.
Replace the barrier() with a READ_ONCE() for the following reasons:
1) READ_ONCE() is a weaker form of barrier() that affects only the specific
load operation, while barrier() is a general compiler level memory barrier.
READ_ONCE() was not available at the time when the barrier was added.
2) Aside from that, READ_ONCE() is descriptive and self-explanatory while a
barrier without a comment is not clear to the casual reader.
No functional change.
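In generic terms the pattern looks like this (a sketch only; lock_if_set()
and pp are hypothetical, not the futex code):

  #include <linux/compiler.h>
  #include <linux/spinlock.h>

  /* *pp may be changed (and set to NULL) concurrently */
  static void lock_if_set(spinlock_t **pp)
  {
          spinlock_t *lock_ptr;

  retry:
          /* READ_ONCE() forces a single load and forbids the compiler from
           * re-reading *pp between the NULL check and spin_lock() */
          lock_ptr = READ_ONCE(*pp);
          if (lock_ptr != NULL) {
                  spin_lock(lock_ptr);
                  if (lock_ptr != *pp) {          /* changed under us, retry */
                          spin_unlock(lock_ptr);
                          goto retry;
                  }
                  /* ... work under the lock ... */
                  spin_unlock(lock_ptr);
          }
  }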
[ tglx: Massaged changelog ]
Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Darren Hart <dvhart@linux.intel.com>
Cc: dave@stgolabs.net
Cc: peterz@infradead.org
Cc: linux@rasmusvillemoes.dk
Cc: akpm@linux-foundation.org
Cc: fengguang.wu@intel.com
Cc: bigeasy@linutronix.de
Link: http://lkml.kernel.org/r/1457314344-5685-1-git-send-email-nasa4836@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/futex.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1939,8 +1939,12 @@ static int unqueue_me(struct futex_q *q)
/* In the common case we don't take the spinlock, which is nice. */
retry:
- lock_ptr = q->lock_ptr;
- barrier();
+ /*
+ * q->lock_ptr can change between this read and the following spin_lock.
+ * Use READ_ONCE to forbid the compiler from reloading q->lock_ptr and
+ * optimizing lock_ptr out of the logic below.
+ */
+ lock_ptr = READ_ONCE(q->lock_ptr);
if (lock_ptr != NULL) {
spin_lock(lock_ptr);
/*
Patches currently in stable-queue which might be from nasa4836@gmail.com are
queue-4.4/futex-replace-barrier-in-unqueue_me-with-read_once.patch
CFL was missing from intel_early_ids[]. The PCI ID needs to be there to
allow the memory region to be stolen; otherwise we could have RAM being
arbitrarily overwritten if, for example, we keep using the UEFI framebuffer,
depending on how the BIOS has set up the e820 map.
Fixes: b056f8f3d6b9 ("drm/i915/cfl: Add Coffee Lake PCI IDs for S Skus.")
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Cc: <stable@vger.kernel.org> # v4.13+ 0890540e21cf drm/i915: add GT number to intel_device_info
Cc: <stable@vger.kernel.org> # v4.13+ 41693fd52373 drm/i915/kbl: Change a KBL pci id to GT2 from GT1.5
Cc: <stable@vger.kernel.org> # v4.13+
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
v2: improve commit message, add Fixes tag and CC stable
arch/x86/kernel/early-quirks.c | 1 +
include/drm/i915_pciids.h | 6 ++++++
2 files changed, 7 insertions(+)
diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
index 3cbb2c78a9df..bae0d32e327b 100644
--- a/arch/x86/kernel/early-quirks.c
+++ b/arch/x86/kernel/early-quirks.c
@@ -528,6 +528,7 @@ static const struct pci_device_id intel_early_ids[] __initconst = {
INTEL_SKL_IDS(&gen9_early_ops),
INTEL_BXT_IDS(&gen9_early_ops),
INTEL_KBL_IDS(&gen9_early_ops),
+ INTEL_CFL_IDS(&gen9_early_ops),
INTEL_GLK_IDS(&gen9_early_ops),
INTEL_CNL_IDS(&gen9_early_ops),
};
diff --git a/include/drm/i915_pciids.h b/include/drm/i915_pciids.h
index 972a25633525..c65e4489006d 100644
--- a/include/drm/i915_pciids.h
+++ b/include/drm/i915_pciids.h
@@ -392,6 +392,12 @@
INTEL_VGA_DEVICE(0x3EA8, info), /* ULT GT3 */ \
INTEL_VGA_DEVICE(0x3EA5, info) /* ULT GT3 */
+#define INTEL_CFL_IDS(info) \
+ INTEL_CFL_S_GT1_IDS(info), \
+ INTEL_CFL_S_GT2_IDS(info), \
+ INTEL_CFL_H_GT2_IDS(info), \
+ INTEL_CFL_U_GT3_IDS(info)
+
/* CNL U 2+2 */
#define INTEL_CNL_U_GT2_IDS(info) \
INTEL_VGA_DEVICE(0x5A52, info), \
--
2.14.3
Quoting Hans:
If we return 1 from our post_reset handler, then our disconnect handler
will be called immediately afterwards. Since pre_reset blocks all scsi
requests our disconnect handler will then hang in the scsi_remove_host
call.
This is esp. bad because our disconnect handler hanging for ever also
stops the USB subsys from enumerating any new USB devices, causes commands
like lsusb to hang, etc.
In practice this happens when unplugging some uas devices because the hub
code may see the device as needing a warm-reset and calls usb_reset_device
before seeing the disconnect. In this case uas_configure_endpoints fails
with -ENODEV. We do not want to print an error for this, so this commit
also silences the shost_printk for -ENODEV.
ENDQUOTE
However, if we do that we better drop any unconditional execution
and report to the SCSI subsystem that we have undergone a reset
but we are not operational now.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Reported-by: Hans de Goede <hdegoede@redhat.com>
CC: stable@vger.kernel.org
---
drivers/usb/storage/uas.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index 5d04c40ee40a..3b1b9695177a 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -1076,20 +1076,19 @@ static int uas_post_reset(struct usb_interface *intf)
return 0;
err = uas_configure_endpoints(devinfo);
- if (err) {
+ if (err && err != -ENODEV)
shost_printk(KERN_ERR, shost,
"%s: alloc streams error %d after reset",
__func__, err);
- return 1;
- }
+ /* we must unblock the host in every case lest we deadlock */
spin_lock_irqsave(shost->host_lock, flags);
scsi_report_bus_reset(shost, 0);
spin_unlock_irqrestore(shost->host_lock, flags);
scsi_unblock_requests(shost);
- return 0;
+ return err ? 1 : 0;
}
static int uas_suspend(struct usb_interface *intf, pm_message_t message)
--
2.13.6