This is a note to let you know that I've just added the patch titled
futex: Remove requirement for lock_page() in get_futex_key()
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
futex-remove-requirement-for-lock_page-in-get_futex_key.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 65d8fc777f6dcfee12785c057a6b57f679641c90 Mon Sep 17 00:00:00 2001
From: Mel Gorman <mgorman(a)suse.de>
Date: Tue, 9 Feb 2016 11:15:14 -0800
Subject: futex: Remove requirement for lock_page() in get_futex_key()
From: Mel Gorman <mgorman(a)suse.de>
commit 65d8fc777f6dcfee12785c057a6b57f679641c90 upstream.
When dealing with key handling for shared futexes, we can drastically reduce
the usage/need of the page lock. 1) For anonymous pages, the associated futex
object is the mm_struct which does not require the page lock. 2) For inode
based, keys, we can check under RCU read lock if the page mapping is still
valid and take reference to the inode. This just leaves one rare race that
requires the page lock in the slow path when examining the swapcache.
Additionally realtime users currently have a problem with the page lock being
contended for unbounded periods of time during futex operations.
Task A
get_futex_key()
lock_page()
---> preempted
Now any other task trying to lock that page will have to wait until
task A gets scheduled back in, which is an unbound time.
With this patch, we pretty much have a lockless futex_get_key().
Experiments show that this patch can boost/speedup the hashing of shared
futexes with the perf futex benchmarks (which is good for measuring such
change) by up to 45% when there are high (> 100) thread counts on a 60 core
Westmere. Lower counts are pretty much in the noise range or less than 10%,
but mid range can be seen at over 30% overall throughput (hash ops/sec).
This makes anon-mem shared futexes much closer to its private counterpart.
Signed-off-by: Mel Gorman <mgorman(a)suse.de>
[ Ported on top of thp refcount rework, changelog, comments, fixes. ]
Signed-off-by: Davidlohr Bueso <dbueso(a)suse.de>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Chris Mason <clm(a)fb.com>
Cc: Darren Hart <dvhart(a)linux.intel.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: dave(a)stgolabs.net
Link: http://lkml.kernel.org/r/1455045314-8305-3-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Chenbo Feng <fengc(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/futex.c | 98 ++++++++++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 91 insertions(+), 7 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -400,6 +400,7 @@ get_futex_key(u32 __user *uaddr, int fsh
unsigned long address = (unsigned long)uaddr;
struct mm_struct *mm = current->mm;
struct page *page, *page_head;
+ struct address_space *mapping;
int err, ro = 0;
/*
@@ -478,7 +479,19 @@ again:
}
#endif
- lock_page(page_head);
+ /*
+ * The treatment of mapping from this point on is critical. The page
+ * lock protects many things but in this context the page lock
+ * stabilizes mapping, prevents inode freeing in the shared
+ * file-backed region case and guards against movement to swap cache.
+ *
+ * Strictly speaking the page lock is not needed in all cases being
+ * considered here and page lock forces unnecessarily serialization
+ * From this point on, mapping will be re-verified if necessary and
+ * page lock will be acquired only if it is unavoidable
+ */
+
+ mapping = READ_ONCE(page_head->mapping);
/*
* If page_head->mapping is NULL, then it cannot be a PageAnon
@@ -495,18 +508,31 @@ again:
* shmem_writepage move it from filecache to swapcache beneath us:
* an unlikely race, but we do need to retry for page_head->mapping.
*/
- if (!page_head->mapping) {
- int shmem_swizzled = PageSwapCache(page_head);
+ if (unlikely(!mapping)) {
+ int shmem_swizzled;
+
+ /*
+ * Page lock is required to identify which special case above
+ * applies. If this is really a shmem page then the page lock
+ * will prevent unexpected transitions.
+ */
+ lock_page(page);
+ shmem_swizzled = PageSwapCache(page) || page->mapping;
unlock_page(page_head);
put_page(page_head);
+
if (shmem_swizzled)
goto again;
+
return -EFAULT;
}
/*
* Private mappings are handled in a simple way.
*
+ * If the futex key is stored on an anonymous page, then the associated
+ * object is the mm which is implicitly pinned by the calling process.
+ *
* NOTE: When userspace waits on a MAP_SHARED mapping, even if
* it's a read-only handle, it's expected that futexes attach to
* the object not the particular process.
@@ -524,16 +550,74 @@ again:
key->both.offset |= FUT_OFF_MMSHARED; /* ref taken on mm */
key->private.mm = mm;
key->private.address = address;
+
+ get_futex_key_refs(key); /* implies smp_mb(); (B) */
+
} else {
+ struct inode *inode;
+
+ /*
+ * The associated futex object in this case is the inode and
+ * the page->mapping must be traversed. Ordinarily this should
+ * be stabilised under page lock but it's not strictly
+ * necessary in this case as we just want to pin the inode, not
+ * update the radix tree or anything like that.
+ *
+ * The RCU read lock is taken as the inode is finally freed
+ * under RCU. If the mapping still matches expectations then the
+ * mapping->host can be safely accessed as being a valid inode.
+ */
+ rcu_read_lock();
+
+ if (READ_ONCE(page_head->mapping) != mapping) {
+ rcu_read_unlock();
+ put_page(page_head);
+
+ goto again;
+ }
+
+ inode = READ_ONCE(mapping->host);
+ if (!inode) {
+ rcu_read_unlock();
+ put_page(page_head);
+
+ goto again;
+ }
+
+ /*
+ * Take a reference unless it is about to be freed. Previously
+ * this reference was taken by ihold under the page lock
+ * pinning the inode in place so i_lock was unnecessary. The
+ * only way for this check to fail is if the inode was
+ * truncated in parallel so warn for now if this happens.
+ *
+ * We are not calling into get_futex_key_refs() in file-backed
+ * cases, therefore a successful atomic_inc return below will
+ * guarantee that get_futex_key() will still imply smp_mb(); (B).
+ */
+ if (WARN_ON_ONCE(!atomic_inc_not_zero(&inode->i_count))) {
+ rcu_read_unlock();
+ put_page(page_head);
+
+ goto again;
+ }
+
+ /* Should be impossible but lets be paranoid for now */
+ if (WARN_ON_ONCE(inode->i_mapping != mapping)) {
+ err = -EFAULT;
+ rcu_read_unlock();
+ iput(inode);
+
+ goto out;
+ }
+
key->both.offset |= FUT_OFF_INODE; /* inode-based key */
- key->shared.inode = page_head->mapping->host;
+ key->shared.inode = inode;
key->shared.pgoff = basepage_index(page);
+ rcu_read_unlock();
}
- get_futex_key_refs(key); /* implies MB (B) */
-
out:
- unlock_page(page_head);
put_page(page_head);
return err;
}
Patches currently in stable-queue which might be from mgorman(a)suse.de are
queue-3.18/futex-remove-requirement-for-lock_page-in-get_futex_key.patch
This is a note to let you know that I've just added the patch titled
random: use lockless method of accessing and updating f->reg_idx
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
random-use-lockless-method-of-accessing-and-updating-f-reg_idx.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 92e75428ffc90e2a0321062379f883f3671cfebe Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <tytso(a)mit.edu>
Date: Wed, 7 Jun 2017 19:01:32 -0400
Subject: random: use lockless method of accessing and updating f->reg_idx
From: Theodore Ts'o <tytso(a)mit.edu>
commit 92e75428ffc90e2a0321062379f883f3671cfebe upstream.
Linus pointed out that there is a much more efficient way of avoiding
the problem that we were trying to address in commit 9dfa7bba35ac0:
"fix race in drivers/char/random.c:get_reg()".
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Cc: Michael Schmitz <schmitzmic(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/char/random.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1115,15 +1115,15 @@ static void add_interrupt_bench(cycles_t
static __u32 get_reg(struct fast_pool *f, struct pt_regs *regs)
{
__u32 *ptr = (__u32 *) regs;
- unsigned long flags;
+ unsigned int idx;
if (regs == NULL)
return 0;
- local_irq_save(flags);
- if (f->reg_idx >= sizeof(struct pt_regs) / sizeof(__u32))
- f->reg_idx = 0;
- ptr += f->reg_idx++;
- local_irq_restore(flags);
+ idx = READ_ONCE(f->reg_idx);
+ if (idx >= sizeof(struct pt_regs) / sizeof(__u32))
+ idx = 0;
+ ptr += idx++;
+ WRITE_ONCE(f->reg_idx, idx);
return *ptr;
}
Patches currently in stable-queue which might be from tytso(a)mit.edu are
queue-4.9/fix-race-in-drivers-char-random.c-get_reg.patch
queue-4.9/ext4-handle-the-rest-of-ext4_mb_load_buddy-enomem-errors.patch
queue-4.9/ext4-fix-off-by-one-on-max-nr_pages-in-ext4_find_unwritten_pgoff.patch
queue-4.9/random-use-lockless-method-of-accessing-and-updating-f-reg_idx.patch
This is a note to let you know that I've just added the patch titled
random: use lockless method of accessing and updating f->reg_idx
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
random-use-lockless-method-of-accessing-and-updating-f-reg_idx.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 92e75428ffc90e2a0321062379f883f3671cfebe Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <tytso(a)mit.edu>
Date: Wed, 7 Jun 2017 19:01:32 -0400
Subject: random: use lockless method of accessing and updating f->reg_idx
From: Theodore Ts'o <tytso(a)mit.edu>
commit 92e75428ffc90e2a0321062379f883f3671cfebe upstream.
Linus pointed out that there is a much more efficient way of avoiding
the problem that we were trying to address in commit 9dfa7bba35ac0:
"fix race in drivers/char/random.c:get_reg()".
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Cc: Michael Schmitz <schmitzmic(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/char/random.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -886,15 +886,15 @@ static void add_interrupt_bench(cycles_t
static __u32 get_reg(struct fast_pool *f, struct pt_regs *regs)
{
__u32 *ptr = (__u32 *) regs;
- unsigned long flags;
+ unsigned int idx;
if (regs == NULL)
return 0;
- local_irq_save(flags);
- if (f->reg_idx >= sizeof(struct pt_regs) / sizeof(__u32))
- f->reg_idx = 0;
- ptr += f->reg_idx++;
- local_irq_restore(flags);
+ idx = READ_ONCE(f->reg_idx);
+ if (idx >= sizeof(struct pt_regs) / sizeof(__u32))
+ idx = 0;
+ ptr += idx++;
+ WRITE_ONCE(f->reg_idx, idx);
return *ptr;
}
Patches currently in stable-queue which might be from tytso(a)mit.edu are
queue-4.4/fix-race-in-drivers-char-random.c-get_reg.patch
queue-4.4/ext4-handle-the-rest-of-ext4_mb_load_buddy-enomem-errors.patch
queue-4.4/ext4-fix-off-by-one-on-max-nr_pages-in-ext4_find_unwritten_pgoff.patch
queue-4.4/random-use-lockless-method-of-accessing-and-updating-f-reg_idx.patch
This is a note to let you know that I've just added the patch titled
random: use lockless method of accessing and updating f->reg_idx
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
random-use-lockless-method-of-accessing-and-updating-f-reg_idx.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 92e75428ffc90e2a0321062379f883f3671cfebe Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <tytso(a)mit.edu>
Date: Wed, 7 Jun 2017 19:01:32 -0400
Subject: random: use lockless method of accessing and updating f->reg_idx
From: Theodore Ts'o <tytso(a)mit.edu>
commit 92e75428ffc90e2a0321062379f883f3671cfebe upstream.
Linus pointed out that there is a much more efficient way of avoiding
the problem that we were trying to address in commit 9dfa7bba35ac0:
"fix race in drivers/char/random.c:get_reg()".
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Cc: Michael Schmitz <schmitzmic(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/char/random.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -863,15 +863,15 @@ static void add_interrupt_bench(cycles_t
static __u32 get_reg(struct fast_pool *f, struct pt_regs *regs)
{
__u32 *ptr = (__u32 *) regs;
- unsigned long flags;
+ unsigned int idx;
if (regs == NULL)
return 0;
- local_irq_save(flags);
- if (f->reg_idx >= sizeof(struct pt_regs) / sizeof(__u32))
- f->reg_idx = 0;
- ptr += f->reg_idx++;
- local_irq_restore(flags);
+ idx = READ_ONCE(f->reg_idx);
+ if (idx >= sizeof(struct pt_regs) / sizeof(__u32))
+ idx = 0;
+ ptr += idx++;
+ WRITE_ONCE(f->reg_idx, idx);
return *ptr;
}
Patches currently in stable-queue which might be from tytso(a)mit.edu are
queue-3.18/fix-race-in-drivers-char-random.c-get_reg.patch
queue-3.18/ext4-fix-off-by-one-on-max-nr_pages-in-ext4_find_unwritten_pgoff.patch
queue-3.18/random-use-lockless-method-of-accessing-and-updating-f-reg_idx.patch
Hi Greg,
Here is the backport of 3f29770723fe ("ipsec: check return value of
skb_to_sgvec always") for the 3.18, 4.4, and 4.9 trees, fixing the
warnings generated by 48a1df65334b ("skbuff: return -EMSGSIZE in
skb_to_sgvec to prevent overflow").
Let me know if there are any issues! First time using git send-email...
Nathan
The backport of 48a1df65334b ("skbuff: return -EMSGSIZE in skb_to_sgvec
to prevent overflow") was missing three supplemental commits:
3f29770723fe ("ipsec: check return value of skb_to_sgvec always")
89a5ea996625 ("rxrpc: check return value of skb_to_sgvec always")
e2fcad58fd23 ("virtio_net: check return value of skb_to_sgvec always")
I already sent the first one, this thread contains the other two for
3.18, 4.4, and 4.9. There is an additional patch because these trees do
not contain f6b10209b90d ("virtio-net: switch to use build_skb() for
small buffer"), which removed another call to skb_to_sgvec, which was
before f6b10209b90d. If you feel it should be squashed in f6b10209b90d,
please by all means feel free to do so. I was not 100% confident on if
the dev_kfree_skb call was necessary in the error path so I hope one of
the authors can comment on that.
Thanks!
Nathan
This is a note to let you know that I've just added the patch titled
xfrm: fix state migration copy replay sequence numbers
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
xfrm-fix-state-migration-copy-replay-sequence-numbers.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Apr 10 13:58:07 CEST 2018
From: Antony Antony <antony(a)phenome.org>
Date: Fri, 19 May 2017 12:47:00 +0200
Subject: xfrm: fix state migration copy replay sequence numbers
From: Antony Antony <antony(a)phenome.org>
[ Upstream commit a486cd23661c9387fb076c3f6ae8b2aa9d20d54a ]
During xfrm migration copy replay and preplay sequence numbers
from the previous state.
Here is a tcpdump output showing the problem.
10.0.10.46 is running vanilla kernel, is the IKE/IPsec responder.
After the migration it sent wrong sequence number, reset to 1.
The migration is from 10.0.0.52 to 10.0.0.53.
IP 10.0.0.52.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7cf), length 136
IP 10.0.10.46.4500 > 10.0.0.52.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x7cf), length 136
IP 10.0.0.52.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7d0), length 136
IP 10.0.10.46.4500 > 10.0.0.52.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x7d0), length 136
IP 10.0.0.53.4500 > 10.0.10.46.4500: NONESP-encap: isakmp: child_sa inf2[I]
IP 10.0.10.46.4500 > 10.0.0.53.4500: NONESP-encap: isakmp: child_sa inf2[R]
IP 10.0.0.53.4500 > 10.0.10.46.4500: NONESP-encap: isakmp: child_sa inf2[I]
IP 10.0.10.46.4500 > 10.0.0.53.4500: NONESP-encap: isakmp: child_sa inf2[R]
IP 10.0.0.53.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7d1), length 136
NOTE: next sequence is wrong 0x1
IP 10.0.10.46.4500 > 10.0.0.53.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x1), length 136
IP 10.0.0.53.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7d2), length 136
IP 10.0.10.46.4500 > 10.0.0.53.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x2), length 136
Signed-off-by: Antony Antony <antony(a)phenome.org>
Reviewed-by: Richard Guy Briggs <rgb(a)tricolour.ca>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/xfrm/xfrm_state.c | 2 ++
1 file changed, 2 insertions(+)
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -1208,6 +1208,8 @@ static struct xfrm_state *xfrm_state_clo
x->curlft.add_time = orig->curlft.add_time;
x->km.state = orig->km.state;
x->km.seq = orig->km.seq;
+ x->replay = orig->replay;
+ x->preplay = orig->preplay;
return x;
Patches currently in stable-queue which might be from antony(a)phenome.org are
queue-3.18/xfrm-fix-state-migration-copy-replay-sequence-numbers.patch
This is a note to let you know that I've just added the patch titled
xen: avoid type warning in xchg_xen_ulong
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
xen-avoid-type-warning-in-xchg_xen_ulong.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Apr 10 13:58:07 CEST 2018
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Thu, 8 Jun 2017 10:53:10 +0200
Subject: xen: avoid type warning in xchg_xen_ulong
From: Arnd Bergmann <arnd(a)arndb.de>
[ Upstream commit 9cc91f212111cdcbefa02dcdb7dd443f224bf52c ]
The improved type-checking version of container_of() triggers a warning for
xchg_xen_ulong, pointing out that 'xen_ulong_t' is unsigned, but atomic64_t
contains a signed value:
drivers/xen/events/events_2l.c: In function 'evtchn_2l_handle_events':
drivers/xen/events/events_2l.c:187:1020: error: call to '__compiletime_assert_187' declared with attribute error: pointer type mismatch in container_of()
This adds a cast to work around the warning.
Cc: Ian Abbott <abbotti(a)mev.co.uk>
Fixes: 85323a991d40 ("xen: arm: mandate EABI and use generic atomic operations.")
Fixes: daa2ac80834d ("kernel.h: handle pointers to arrays better in container_of()")
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Signed-off-by: Stefano Stabellini <sstabellini(a)kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini(a)kernel.org>
Acked-by: Ian Abbott <abbotti(a)mev.co.uk>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/include/asm/xen/events.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm/include/asm/xen/events.h
+++ b/arch/arm/include/asm/xen/events.h
@@ -16,7 +16,7 @@ static inline int xen_irqs_disabled(stru
return raw_irqs_disabled_flags(regs->ARM_cpsr);
}
-#define xchg_xen_ulong(ptr, val) atomic64_xchg(container_of((ptr), \
+#define xchg_xen_ulong(ptr, val) atomic64_xchg(container_of((long long*)(ptr),\
atomic64_t, \
counter), (val))
Patches currently in stable-queue which might be from arnd(a)arndb.de are
queue-3.18/xen-avoid-type-warning-in-xchg_xen_ulong.patch
This is a note to let you know that I've just added the patch titled
x86/tsc: Provide 'tsc=unstable' boot parameter
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-tsc-provide-tsc-unstable-boot-parameter.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Apr 10 13:58:07 CEST 2018
From: Peter Zijlstra <peterz(a)infradead.org>
Date: Thu, 13 Apr 2017 14:56:44 +0200
Subject: x86/tsc: Provide 'tsc=unstable' boot parameter
From: Peter Zijlstra <peterz(a)infradead.org>
[ Upstream commit 8309f86cd41e8714526867177facf7a316d9be53 ]
Since the clocksource watchdog will only detect broken TSC after the
fact, all TSC based clocks will likely have observed non-continuous
values before/when switching away from TSC.
Therefore only thing to fully avoid random clock movement when your
BIOS randomly mucks with TSC values from SMI handlers is reporting the
TSC as unstable at boot.
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mike Galbraith <efault(a)gmx.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-kernel(a)vger.kernel.org
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/tsc.c | 2 ++
1 file changed, 2 insertions(+)
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -356,6 +356,8 @@ static int __init tsc_setup(char *str)
tsc_clocksource_reliable = 1;
if (!strncmp(str, "noirqtime", 9))
no_sched_irq_time = 1;
+ if (!strcmp(str, "unstable"))
+ mark_tsc_unstable("boot parameter");
return 1;
}
Patches currently in stable-queue which might be from peterz(a)infradead.org are
queue-3.18/perf-core-correct-event-creation-with-perf_format_group.patch
queue-3.18/x86-tsc-provide-tsc-unstable-boot-parameter.patch