 
This patchset optimizes the generic sched_clock implementation to significantly reduce the data cache profile. It also makes it safe to call sched_clock() from NMI (or FIQ on ARM).
The data cache profile of sched_clock() in both the original code and my previous patch was two or three (64-byte) cache lines, depending on the alignment of struct clock_data. After patching, the cache profile for the normal case should be a single cache line.
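To illustrate the cache line argument, here is a minimal userspace sketch. It is not the layout used in the patches: struct hot_read_data, struct clock_state, their fields and cache_lines_touched() are hypothetical names chosen for illustration, and a 64-byte line size is assumed. The idea is to group everything the reader needs next to the sequence counter, align that group to a cache line, and push the rarely used fields onto a separate line.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>

#define CACHE_LINE_BYTES 64	/* assumption: 64-byte cache lines */

/* Hypothetical hot-path data: everything a reader has to load. */
struct hot_read_data {
	uint64_t epoch_ns;
	uint64_t epoch_cyc;
	uint64_t mask;
	uint64_t (*read_counter)(void);
	uint32_t mult;
	uint32_t shift;
};

/* Hypothetical container: hot fields first, cold fields on their own line. */
struct clock_state {
	_Alignas(CACHE_LINE_BYTES) uint32_t seq;
	struct hot_read_data rd;
	/* Cold fields, only used on the update path. */
	_Alignas(CACHE_LINE_BYTES) uint64_t wrap_ns;
	unsigned long rate;
};

/* Number of cache lines covered by len bytes starting at byte offset start. */
static size_t cache_lines_touched(size_t start, size_t len)
{
	size_t first = start / CACHE_LINE_BYTES;
	size_t last = (start + len - 1) / CACHE_LINE_BYTES;

	return last - first + 1;
}

/* Build-time check: seq plus the hot data fit in the first cache line. */
static_assert(offsetof(struct clock_state, rd) + sizeof(struct hot_read_data)
	      <= CACHE_LINE_BYTES,
	      "hot read data must fit in one cache line");
```

With this helper, an unaligned 72-byte structure covers two lines when it starts at offset 40 and three when it starts at offset 60, which is the kind of alignment dependence described above.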
NMI safety was tested on i.MX6 with perf drowning the system in FIQs and using the perf handler to check that sched_clock() returned monotonic values. At the same time I forcefully reduced kt_wrap so that update_sched_clock() was called at >1000Hz.
Without the patches the above system is grossly unstable, surviving 9K, 115K and 25K perf event cycles during three separate runs. With the patches I ran for over 9M perf event cycles before getting bored.
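For readers unfamiliar with why a plain seqcount reader can deadlock in NMI context: if the NMI interrupts the writer while the sequence count is odd, a reader that spins waiting for an even count never makes progress. One common way around this is to keep two copies (banks) of the read data and let the low bit of the sequence count select the copy that is currently stable. The sketch below shows that general technique in userspace C11. It is an illustration under stated assumptions, not the code from this series: struct latched_clock, latch_init(), latch_update() and latch_read_ns() are hypothetical names, and kernel code would use its own seqcount and barrier primitives rather than <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical subset of the data a reader needs. */
struct read_data {
	uint64_t epoch_ns;
	uint64_t epoch_cyc;
	uint32_t mult;
	uint32_t shift;
};

/* Two banks: readers use bank[seq & 1], so one bank is always stable. */
struct latched_clock {
	atomic_uint seq;
	struct read_data bank[2];
};

static void latch_init(struct latched_clock *c, const struct read_data *rd)
{
	atomic_init(&c->seq, 0);
	c->bank[0] = *rd;
	c->bank[1] = *rd;
}

/* Single update function: no window in which readers must wait. */
static void latch_update(struct latched_clock *c, const struct read_data *rd)
{
	/* seq becomes odd: readers are steered to bank[1]. */
	atomic_fetch_add(&c->seq, 1);
	c->bank[0] = *rd;
	atomic_thread_fence(memory_order_seq_cst);

	/* seq becomes even: readers are steered to the updated bank[0]. */
	atomic_fetch_add(&c->seq, 1);
	c->bank[1] = *rd;
	atomic_thread_fence(memory_order_seq_cst);
}

/* Reader: retries only if an update raced with the copy, never spins on odd. */
static uint64_t latch_read_ns(struct latched_clock *c, uint64_t cyc)
{
	struct read_data rd;
	unsigned int seq;

	do {
		seq = atomic_load(&c->seq);
		rd = c->bank[seq & 1];
		atomic_thread_fence(memory_order_seq_cst);
	} while (atomic_load(&c->seq) != seq);

	return rd.epoch_ns + (((cyc - rd.epoch_cyc) * rd.mult) >> rd.shift);
}
```

A reader that runs while the count is odd simply uses bank[1], so it completes even if it has interrupted the writer. The cost is a second copy of the read data and an update that writes it twice. Note that the plain struct copies here are only well defined because the sketch is exercised from a single thread; concurrent use would need the appropriate kernel accessors and barriers.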
v3:
* Optimized to minimise cache profile, including elimination of the suspended flag (Thomas Gleixner).
* Replaced the update_bank_begin/end with a single update function (Thomas Gleixner).
* Split into multiple patches to aid review.
v2:
* Extended the scope of the read lock in sched_clock() so we can bank all data consumed there (John Stultz)
Daniel Thompson (4):
  sched_clock: Match scope of read and write seqcounts
  sched_clock: Optimize cache line usage
  sched_clock: Remove suspend from clock_read_data
  sched_clock: Avoid deadlock during read from NMI

 kernel/time/sched_clock.c | 163 ++++++++++++++++++++++++++++++----------------
 1 file changed, 107 insertions(+), 56 deletions(-)

--
1.9.3