Greeting,
FYI, we noticed a -6.8% regression of vm-scalability.median due to commit:
commit: a449d926203c733fecddc1fd077cf3d6ecef7f15 ("[PATCH] perf/core: fix corner case in perf_rotate_context()") url: https://github.com/0day-ci/linux/commits/Song-Liu/perf-core-fix-corner-case-...
in testcase: vm-scalability on test machine: 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 8G memory with following parameters:
runtime: 300s size: 8T test: anon-w-seq cpufreq_governor: performance ucode: 0x21
test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us. test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
If you fix the issue, kindly add following tag Reported-by: kernel test robot rong.a.chen@intel.com
Details are as below: -------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git cd lkp-tests bin/lkp install job.yaml # job file is attached in this email bin/lkp run job.yaml
========================================================================================= compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode: gcc-7/performance/x86_64-rhel-7.6/debian-x86_64-2019-09-23.cgz/300s/8T/lkp-ivb-d02/anon-w-seq/vm-scalability/0x21
commit: e336b40277 ("kprobes: Prohibit probing on BUG() and WARN() address") a449d92620 ("perf/core: fix corner case in perf_rotate_context()")
e336b4027775cb45 a449d926203c733fecddc1fd077 ---------------- --------------------------- %stddev %change %stddev \ | \ 0.00 ± 2% +3.8% 0.00 vm-scalability.free_time 1235827 -6.8% 1151821 vm-scalability.median 0.02 ± 8% +51.1% 0.03 ± 5% vm-scalability.median_stddev 4931119 -6.6% 4606513 vm-scalability.throughput 1624883 -6.4% 1521193 vm-scalability.time.minor_page_faults 228.43 ± 3% +20.6% 275.45 vm-scalability.time.system_time 919.50 -5.6% 867.64 vm-scalability.time.user_time 11336 -3.9% 10895 vm-scalability.time.voluntary_context_switches 1.432e+09 -7.1% 1.33e+09 vm-scalability.workload 2.07 ± 4% +0.7 2.81 ± 14% mpstat.cpu.all.idle% 20.04 ± 3% +3.8 23.89 mpstat.cpu.all.sys% 77.00 -5.8% 72.50 vmstat.cpu.us 7171 -3.0% 6958 vmstat.system.cs 12591513 ± 16% +60.0% 20142845 ± 27% cpuidle.C6.time 21824 ± 11% +44.6% 31557 ± 19% cpuidle.C6.usage 1123 ± 17% +86.2% 2092 ± 24% cpuidle.POLL.usage 8568 +11.3% 9539 interrupts.CAL:Function_call_interrupts 7614 ± 33% +87.5% 14280 ± 25% interrupts.CPU3.RES:Rescheduling_interrupts 15566 ± 20% +120.5% 34321 ± 28% interrupts.NMI:Non-maskable_interrupts 15566 ± 20% +120.5% 34321 ± 28% interrupts.PMI:Performance_monitoring_interrupts 37492 ± 9% +33.8% 50168 ± 2% interrupts.RES:Rescheduling_interrupts 31894 ± 6% -19.7% 25618 ± 5% softirqs.CPU0.RCU 32409 ± 6% -21.5% 25453 ± 8% softirqs.CPU1.RCU 35578 ± 3% -26.0% 26329 ± 11% softirqs.CPU2.RCU 33765 ± 9% -28.8% 24030 ± 3% softirqs.CPU3.RCU 133648 ± 4% -24.1% 101433 ± 5% softirqs.RCU 21819 ± 11% +44.6% 31556 ± 19% turbostat.C6 1.04 ± 16% +0.6 1.66 ± 26% turbostat.C6% 0.27 ± 29% +247.2% 0.94 ± 41% turbostat.CPU%c6 18.86 -2.1% 18.46 turbostat.CorWatt 0.02 ± 44% +520.0% 0.15 ± 40% turbostat.Pkg%pc2 0.02 ±101% +1466.7% 0.35 ± 63% turbostat.Pkg%pc6 37.19 -1.5% 36.64 turbostat.PkgWatt 2167377 +51.8% 3289728 ± 2% meminfo.Active 2165895 +51.8% 3288244 ± 2% meminfo.Active(anon) 2006718 +53.7% 3085007 ± 2% meminfo.AnonHugePages 2151900 +49.3% 3213568 ± 2% meminfo.AnonPages 141552 -60.7% 55678 ± 8% meminfo.CmaFree 4584114 -24.5% 3458742 meminfo.MemAvailable 4694537 -24.0% 3569088 meminfo.MemFree 3337073 +33.7% 4462514 meminfo.Memused 7677 +27.5% 9791 meminfo.PageTables 9532 ± 12% -38.9% 5820 ± 20% sched_debug.cfs_rq:/.min_vruntime.stddev 29.67 ± 4% -17.8% 24.38 ± 3% sched_debug.cfs_rq:/.nr_spread_over.max 11.21 ± 6% -16.5% 9.36 ± 2% sched_debug.cfs_rq:/.nr_spread_over.stddev 103675 ± 28% -22.0% 80844 ± 6% sched_debug.cfs_rq:/.runnable_weight.stddev 20991 ± 40% -72.0% 5881 ± 52% sched_debug.cfs_rq:/.spread0.max 9528 ± 12% -38.9% 5820 ± 20% sched_debug.cfs_rq:/.spread0.stddev 371922 ± 10% +22.9% 456951 ± 8% sched_debug.cpu.avg_idle.min 191541 ± 12% -19.1% 154992 ± 8% sched_debug.cpu.avg_idle.stddev 318.92 ± 68% -59.2% 130.17 ±124% sched_debug.cpu.curr->pid.stddev 540668 +35.4% 732070 ± 3% proc-vmstat.nr_active_anon 536027 +30.6% 700171 ± 4% proc-vmstat.nr_anon_pages 977.25 +34.4% 1313 ± 4% proc-vmstat.nr_anon_transparent_hugepages 114025 -16.8% 94860 ± 2% proc-vmstat.nr_dirty_background_threshold 228332 -16.8% 189953 ± 2% proc-vmstat.nr_dirty_threshold 35358 -49.5% 17856 ± 8% proc-vmstat.nr_free_cma 1174409 -16.3% 982468 ± 2% proc-vmstat.nr_free_pages 1920 +17.3% 2252 ± 2% proc-vmstat.nr_page_table_pages 6783 +3.2% 7000 proc-vmstat.nr_shmem 540670 +35.4% 732059 ± 3% proc-vmstat.nr_zone_active_anon 2505028 -5.5% 2366487 proc-vmstat.numa_hit 2505028 -5.5% 2366487 proc-vmstat.numa_local 1609 ± 5% +30.7% 2103 ± 10% proc-vmstat.pgactivate 72313983 -7.6% 66816607 ± 2% proc-vmstat.pgalloc_dma32 2.472e+08 -6.9% 2.301e+08 proc-vmstat.pgalloc_normal 2007636 -5.0% 1907338 proc-vmstat.pgfault 3.193e+08 -7.2% 2.963e+08 proc-vmstat.pgfree 619835 -7.2% 575294 proc-vmstat.thp_deferred_split_page 620290 -7.1% 576268 proc-vmstat.thp_fault_alloc 4.948e+09 -8.6% 4.525e+09 perf-stat.i.branch-instructions 71787571 -6.5% 67122869 perf-stat.i.cache-misses 79483511 -7.0% 73906455 perf-stat.i.cache-references 7224 -2.9% 7017 perf-stat.i.context-switches 0.82 +7.9% 0.88 perf-stat.i.cpi 1.272e+10 -1.1% 1.258e+10 perf-stat.i.cpu-cycles 96.49 ± 10% -22.6% 74.70 ± 2% perf-stat.i.cpu-migrations 185.50 +3.7% 192.42 perf-stat.i.cycles-between-cache-misses 3.354e+09 -6.7% 3.129e+09 perf-stat.i.dTLB-loads 0.01 ± 5% +0.0 0.01 ± 4% perf-stat.i.dTLB-store-miss-rate% 1.419e+09 -6.9% 1.321e+09 perf-stat.i.dTLB-stores 1.558e+10 -7.8% 1.437e+10 perf-stat.i.instructions 1.22 -6.8% 1.14 perf-stat.i.ipc 6581 -5.4% 6226 perf-stat.i.minor-faults 6581 -5.4% 6226 perf-stat.i.page-faults 0.82 +7.2% 0.88 perf-stat.overall.cpi 177.25 +5.8% 187.47 perf-stat.overall.cycles-between-cache-misses 1.22 -6.7% 1.14 perf-stat.overall.ipc 4.936e+09 -8.6% 4.513e+09 perf-stat.ps.branch-instructions 71615773 -6.5% 66940324 perf-stat.ps.cache-misses 79293323 -7.0% 73705360 perf-stat.ps.cache-references 7207 -2.9% 6998 perf-stat.ps.context-switches 1.269e+10 -1.1% 1.255e+10 perf-stat.ps.cpu-cycles 96.25 ± 10% -22.6% 74.50 ± 2% perf-stat.ps.cpu-migrations 3.346e+09 -6.7% 3.12e+09 perf-stat.ps.dTLB-loads 1.416e+09 -6.9% 1.318e+09 perf-stat.ps.dTLB-stores 1.554e+10 -7.8% 1.433e+10 perf-stat.ps.instructions 6565 -5.4% 6209 perf-stat.ps.minor-faults 6565 -5.4% 6209 perf-stat.ps.page-faults 4.691e+12 -7.6% 4.333e+12 perf-stat.total.instructions 67.81 -3.9 63.88 perf-profile.calltrace.cycles-pp.do_rw_once 0.89 ± 8% -0.3 0.54 ± 57% perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.do_rw_once 0.92 ± 8% -0.2 0.69 ± 15% perf-profile.calltrace.cycles-pp.apic_timer_interrupt.do_rw_once 45.88 +3.6 49.44 perf-profile.calltrace.cycles-pp.do_access 14.92 ± 4% +6.1 20.99 ± 4% perf-profile.calltrace.cycles-pp.clear_page_erms.clear_subpage.clear_huge_page.do_huge_pmd_anonymous_page.__handle_mm_fault 15.42 ± 3% +6.3 21.76 ± 4% perf-profile.calltrace.cycles-pp.clear_subpage.clear_huge_page.do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault 15.92 ± 3% +6.6 22.52 ± 4% perf-profile.calltrace.cycles-pp.clear_huge_page.do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault.__do_page_fault 17.18 ± 3% +6.6 23.79 ± 4% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault 17.00 ± 3% +6.7 23.68 ± 4% perf-profile.calltrace.cycles-pp.do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault 16.45 +7.0 23.43 perf-profile.calltrace.cycles-pp.page_fault.do_access 16.29 +7.0 23.27 perf-profile.calltrace.cycles-pp.handle_mm_fault.__do_page_fault.do_page_fault.page_fault.do_access 16.39 +7.0 23.38 perf-profile.calltrace.cycles-pp.__do_page_fault.do_page_fault.page_fault.do_access 16.43 +7.0 23.42 perf-profile.calltrace.cycles-pp.do_page_fault.page_fault.do_access 68.56 -4.7 63.88 perf-profile.children.cycles-pp.do_rw_once 1.62 ± 4% -0.3 1.31 ± 13% perf-profile.children.cycles-pp.smp_apic_timer_interrupt 1.73 ± 4% -0.3 1.42 ± 12% perf-profile.children.cycles-pp.apic_timer_interrupt 1.15 ± 6% -0.2 0.90 ± 13% perf-profile.children.cycles-pp.__hrtimer_run_queues 0.83 ± 3% -0.2 0.63 ± 17% perf-profile.children.cycles-pp.tick_sched_timer 0.66 ± 12% -0.2 0.51 ± 18% perf-profile.children.cycles-pp.update_process_times 0.24 ± 32% -0.1 0.11 ± 15% perf-profile.children.cycles-pp.irq_work_run_list 0.19 ± 21% -0.1 0.09 ± 21% perf-profile.children.cycles-pp.irq_work_interrupt 0.19 ± 21% -0.1 0.09 ± 21% perf-profile.children.cycles-pp.smp_irq_work_interrupt 0.18 ± 22% -0.1 0.09 ± 21% perf-profile.children.cycles-pp.irq_work_run 0.18 ± 22% -0.1 0.09 ± 21% perf-profile.children.cycles-pp.printk 0.10 ± 14% -0.0 0.07 ± 25% perf-profile.children.cycles-pp.try_to_wake_up 0.26 ± 13% +0.0 0.31 ± 4% perf-profile.children.cycles-pp.__free_pages_ok 0.34 ± 8% +0.1 0.41 ± 2% perf-profile.children.cycles-pp.__x64_sys_munmap 0.34 ± 8% +0.1 0.41 ± 3% perf-profile.children.cycles-pp.__vm_munmap 0.32 ± 11% +0.1 0.39 ± 2% perf-profile.children.cycles-pp.tlb_finish_mmu 0.32 ± 11% +0.1 0.39 ± 2% perf-profile.children.cycles-pp.tlb_flush_mmu 0.33 ± 12% +0.1 0.40 ± 3% perf-profile.children.cycles-pp.release_pages 0.34 ± 8% +0.1 0.42 ± 4% perf-profile.children.cycles-pp.__do_munmap 0.34 ± 8% +0.1 0.41 ± 3% perf-profile.children.cycles-pp.unmap_region 0.28 ± 9% +0.1 0.37 ± 5% perf-profile.children.cycles-pp.munmap 0.18 ± 8% +0.1 0.29 ± 9% perf-profile.children.cycles-pp._cond_resched 0.08 ± 26% +0.1 0.19 ± 12% perf-profile.children.cycles-pp._raw_spin_lock_irqsave 0.12 ± 15% +0.1 0.24 ± 13% perf-profile.children.cycles-pp.rcu_all_qs 0.30 ± 12% +0.1 0.43 ± 5% perf-profile.children.cycles-pp.___might_sleep 0.00 +0.1 0.15 ± 7% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath 45.22 +4.3 49.49 perf-profile.children.cycles-pp.do_access 15.19 ± 3% +6.4 21.58 ± 3% perf-profile.children.cycles-pp.clear_page_erms 15.47 ± 3% +6.6 22.03 ± 3% perf-profile.children.cycles-pp.clear_subpage 17.56 ± 3% +6.8 24.35 ± 3% perf-profile.children.cycles-pp.page_fault 15.98 ± 3% +6.8 22.78 ± 3% perf-profile.children.cycles-pp.clear_huge_page 17.35 ± 3% +6.8 24.15 ± 3% perf-profile.children.cycles-pp.handle_mm_fault 17.30 ± 3% +6.8 24.10 ± 3% perf-profile.children.cycles-pp.__handle_mm_fault 17.51 ± 3% +6.8 24.32 ± 3% perf-profile.children.cycles-pp.do_page_fault 17.47 ± 3% +6.8 24.29 ± 3% perf-profile.children.cycles-pp.__do_page_fault 17.02 ± 2% +6.9 23.91 ± 3% perf-profile.children.cycles-pp.do_huge_pmd_anonymous_page 59.31 -4.4 54.88 perf-profile.self.cycles-pp.do_rw_once 19.43 ± 2% -1.9 17.50 perf-profile.self.cycles-pp.do_access 0.23 ± 12% +0.1 0.29 ± 7% perf-profile.self.cycles-pp.__free_pages_ok 0.29 ± 10% +0.1 0.40 ± 7% perf-profile.self.cycles-pp.___might_sleep 0.07 ± 7% +0.1 0.18 ± 11% perf-profile.self.cycles-pp.rcu_all_qs 0.00 +0.1 0.15 ± 7% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath 0.31 ± 19% +0.2 0.51 ± 13% perf-profile.self.cycles-pp.clear_subpage 14.82 ± 3% +6.4 21.21 ± 3% perf-profile.self.cycles-pp.clear_page_erms
vm-scalability.time.user_time
940 +-+-------------------------------------------------------------------+ | + | 930 +-+ + + +. .. + + + | 920 +-++. .. + .+ .. + + + +..+.+.. + : + :| |.. +..+.+ +. + .+.+ +..+. .+. + +. + : + :| 910 +-+ +. +. + + + | | | 900 +-+ | | | 890 +-+ | 880 +-+ | O | 870 +-+ O O O O O O O O O O O | | O O O O O O O O O | 860 +-+----------O-----------O-------------------O---------O--------------+
vm-scalability.time.system_time
290 +-+-------------------------------------------------------------------+ | O | 280 +-+O O O O O O O O O | 270 O-+ O O O O O O O O O O O O O O | | | 260 +-+ | | | 250 +-+ | | | 240 +-+ +.. + .+ | 230 +-+ +..+. +..+.+.. +.. + + + .+..+ + +.. :| | + +.. .+.. + + + + + .+.. .+..+ + + :| 220 +-++ + + + + + + + | | | 210 +-+-------------------------------------------------------------------+
vm-scalability.throughput
5e+06 +-+--------------------------------------------------------------+ | | 4.95e+06 +-+ .+. .+.+.+.. .+..+. .+. .+.+..+. .+.| | +. +.+. +.+ +.+. + +.+..+. .+..+.+.+. | 4.9e+06 +-+ + | 4.85e+06 +-+ | | | 4.8e+06 +-+ | | | 4.75e+06 +-+ | 4.7e+06 +-+ | | | 4.65e+06 +-+ | | O O O O O O | 4.6e+06 O-O----O----O-O-O--O-O-O------O--O-----O--O-O-O--O-O-O--O--------+
vm-scalability.median
1.25e+06 +-+--------------------------------------------------------------+ 1.24e+06 +-+ .+. +. .+.. .+. .+. | | +. +. .. + +.+.+..+.+.+. + +..+.+.+..+. .+.+..+.| 1.23e+06 +-+ + +.+..+ | 1.22e+06 +-+ | | | 1.21e+06 +-+ | 1.2e+06 +-+ | 1.19e+06 +-+ | | | 1.18e+06 +-+ | 1.17e+06 +-+ | | | 1.16e+06 O-+ O O O O O O O | 1.15e+06 +-O----O----O-O----O-O-O------O--O-----O--O-O-O--O-O-O--O--------+
vm-scalability.workload
1.46e+09 +-+--------------------------------------------------------------+ | | 1.44e+09 +-+ .+. +. .+.. .+. .+. .+. .+. | | +. +. .. + +.+.+. +.+. + +. +.+..+.+.+..+.+.+..+.| 1.42e+09 +-+ + | | | 1.4e+09 +-+ | | | 1.38e+09 +-+ | | | 1.36e+09 +-+ | | | 1.34e+09 O-+ O O | | O O O O O O O O O O O O O O O O O O O O O | 1.32e+09 +-+---------O----------------------------------------------------+
[*] bisect-good sample [O] bisect-bad sample
Disclaimer: Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.
Thanks, Rong Chen