Hello Sarbojit,
About the HW setup:
1. ARM target board
2. Octa-core (4 big and 4 small cores)
We are using the hackbench tool with its default configuration to benchmark scheduler performance. As described in the mail below, we took 10 runs and recorded the hackbench output for each.
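In case it helps to reproduce the numbers, below is a minimal standalone sketch of the 10-sample methodology. The hackbench invocation and the timing wrapper are illustrative assumptions, not the exact harness we used:

/* Hypothetical harness: run hackbench (default config) ten times and
 * print the min/max/average wall-clock time per run. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define SAMPLES 10

int main(void)
{
	double sum = 0.0, min = 1e9, max = 0.0;
	int i;

	for (i = 0; i < SAMPLES; i++) {
		struct timespec t0, t1;
		double t;

		clock_gettime(CLOCK_MONOTONIC, &t0);
		if (system("hackbench > /dev/null") != 0) {
			fprintf(stderr, "hackbench run %d failed\n", i + 1);
			return 1;
		}
		clock_gettime(CLOCK_MONOTONIC, &t1);

		t = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
		printf("sample %2d: %.3f secs\n", i + 1, t);
		if (t < min)
			min = t;
		if (t > max)
			max = t;
		sum += t;
	}
	printf("min %.3f  max %.3f  avg %.3f secs\n", min, max, sum / SAMPLES);
	return 0;
}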
Thanks, Rahul Khandelwal
-----Original Message-----
From: Sarbojit Ganguly [mailto:ganguly.s@samsung.com]
Sent: 07 September 2015 09:03
To: Rahul Khandelwal (Rahul Khandelwal); Sarbojit Ganguly; Alex Shi; Jon Medhurst (Tixy)
Cc: Hiren Bharatbhai Gajjar; khilman@linaro.org; Morten.Rasmussen@arm.com; vincent.guittot@linaro.org; Gaurav Jindal (Gaurav Jindal); linaro-kernel@lists.linaro.org; Sanjeev Yadav (Sanjeev Kumar Yadav); chris.redpath@arm.com; Patrick Bellasi; SHARAN ALLUR; VIKRAM MUPPARTHI; SUNEEL KUMAR SURIMANI
Subject: Re: RE: Re: [PATCH 1/1] HMP: Do not send IPI if core already waked up
Hello Rahul,
Can you please share the setup of your benchmark? It will be easier to understand the scenario when the following details are given:
1. HW setup (machine configuration)
2. The program used for benchmarking.
I am also preparing the same and will share the details soon.
Regards, Sarbojit
------- Original Message -------
Sender : Rahul Khandelwal (Rahul Khandelwal) <Rahul.Khandelwal@spreadtrum.com>
Date : Sep 04, 2015 23:10 (GMT+05:30)
Title : RE: Re: [PATCH 1/1] HMP: Do not send IPI if core already waked up
Dear All,
We did further checking with some modifications to the code that makes the up-migration decision.
Instead of fetching the heaviest task in the slower domain, then searching for an idle CPU in the big domain, and then sending an IPI to that big CPU,
we now first look for the idle CPUs in the bigger domain and then send IPIs to the big idle CPUs as long as there exists a heavy task on the slower-domain CPUs.
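To make the new order of operations concrete, here is a small userspace sketch (not the kernel code itself); the CPU counts, load values and threshold below are made up purely for illustration:

/* Illustration of the new decision order: first collect the idle CPUs of
 * the big cluster, then hand out one idle big CPU (IPI target) per heavy
 * task found on the little cluster, until either runs out. */
#include <stdio.h>

#define NR_BIG		4
#define NR_LITTLE	4
#define UP_THRESHOLD	700	/* assumed up-migration threshold */

int main(void)
{
	int big_load[NR_BIG] = { 0, 512, 0, 0 };	/* 0 means the big CPU is idle */
	int little_task_load[NR_LITTLE] = { 900, 100, 800, 950 };	/* heaviest task per little CPU */
	int idle_big[NR_BIG];
	int n_idle = 0;
	int cpu;

	/* Step 1: build the idle mask of the big cluster up front. */
	for (cpu = 0; cpu < NR_BIG; cpu++)
		if (big_load[cpu] == 0)
			idle_big[n_idle++] = cpu;

	/* Step 2: while idle big CPUs remain, wake one per heavy little-CPU task. */
	for (cpu = 0; cpu < NR_LITTLE && n_idle > 0; cpu++) {
		if (little_task_load[cpu] > UP_THRESHOLD) {
			int target = idle_big[--n_idle];
			printf("little CPU %d: heavy task -> IPI to big CPU %d\n",
			       cpu, target);
		}
	}
	return 0;
}

The actual patch below additionally walks up to hmp_max_tasks entities per little runqueue and respects task affinity and the priority filter; the sketch only captures the ordering of the two steps.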
We checked the performance of this patch with the hackbench tool.
Results for 10 samples are:
Without patch:
maximum time of execution: 3.342 secs
minimum time of execution: 3.313 secs
average of 10 samples: 3.368 secs
With patch:
maximum time of execution: 3.319 secs
minimum time of execution: 3.071 secs
average of 10 samples: 3.169 secs
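For reference, the improvement quoted below is just the relative difference of the two averages; a trivial standalone check (values copied from above):

/* Relative improvement of the average hackbench runtime. */
#include <stdio.h>

int main(void)
{
	double avg_without = 3.368, avg_with = 3.169;	/* secs, 10 samples each */

	printf("improvement: %.1f%%\n",
	       (avg_without - avg_with) / avg_without * 100.0);	/* ~5.9% */
	return 0;
}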
On average, this patch gives a performance improvement of about 5.90%.
This essentially gives the scheduler better responsiveness in migrating big tasks to the big CPUs.
---------------------------------------------------------------------------------------------------------------------------------
From e1df1fc8f069aafc592b42bc3fcc7780f32048a7 Mon Sep 17 00:00:00 2001
From: rahulkhandelwal
Date: Sat, 5 Sep 2015 01:35:23 +0800
Subject: [PATCH 1/1] HMP: send IPI to all available big idle CPUs while there exists any heavy task on slower domain CPUs

Instead of fetching the heaviest task in the slower domain, then searching for an idle CPU in the big domain, and then sending an IPI to that big CPU, we now first look for the idle CPUs in the bigger domain and then send IPIs to the big idle CPUs as long as there exists a heavy task on the slower-domain CPUs.

CC: Sanjeev Yadav
CC: Gaurav Jindal
CC: sprd-ind-kernel-group@googlegroups.com
Signed-off-by: rahulkhandelwal
---
 kernel/sched/fair.c | 188 ++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 155 insertions(+), 33 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1baf641..a81ad6c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7156,6 +7156,34 @@ out:
 	put_task_struct(p);
 }
 
+/*
+ * Update idle cpu for hmp_domain (faster or slower).
+ * Return 1 for success (found idle cpu), otherwise 0.
+ */
+static inline unsigned int hmp_domain_idle_cpu(struct hmp_domain *hmpd,
+				int *idle_cpu, struct cpumask *affinity)
+{
+	int cpu;
+	unsigned long contrib;
+	struct sched_avg *avg;
+	struct cpumask temp_cpumask;
+
+	cpumask_and(&temp_cpumask, &hmpd->cpus, affinity ? affinity :
+			cpu_online_mask);
+
+	for_each_cpu_mask(cpu, temp_cpumask) {
+		avg = &cpu_rq(cpu)->avg;
+		contrib = avg->load_avg_ratio;
+		if (!contrib && !cpu_rq(cpu)->wake_for_idle_pull) {
+			if (idle_cpu)
+				*idle_cpu = cpu;
+			return 1;
+		}
+	}
+	return 0;
+}
+
 static DEFINE_SPINLOCK(hmp_force_migration);
 
 /*
@@ -7164,23 +7192,125 @@ static DEFINE_SPINLOCK(hmp_force_migration);
  */
 static void hmp_force_up_migration(int this_cpu)
 {
-	int cpu, target_cpu;
+	int cpu;
+	int target_cpu = NR_CPUS;
 	struct sched_entity *curr, *orig;
-	struct rq *target;
 	unsigned long flags;
 	unsigned int force, got_target;
 	struct task_struct *p;
+	struct cpumask idle_cpus = {0};
+	struct sched_avg *avg;
+	struct sched_entity *se;
+	struct rq *rq;
+	int nr_count;
 
-	if (!spin_trylock(&hmp_force_migration))
+	struct cpumask *hmp_slowcpu_mask = NULL;
+	struct cpumask *hmp_fastcpu_mask = NULL;
+	u64 now;
+
+	if (!spin_trylock(&hmp_force_migration))
 		return;
-	for_each_online_cpu(cpu) {
+
+	if (hmp_cpu_is_slowest(this_cpu)) {
+		hmp_fastcpu_mask = &hmp_faster_domain(this_cpu)->cpus;
+		hmp_slowcpu_mask = &hmp_cpu_domain(this_cpu)->cpus;
+	} else {
+		hmp_fastcpu_mask = &hmp_cpu_domain(this_cpu)->cpus;
+		hmp_slowcpu_mask = &hmp_slower_domain(this_cpu)->cpus;
+	}
+
+	/* Check for idle_cpus in faster domain, */
+	/* update the idle mask if load_avg_ratio = 0 */
+	for_each_cpu(cpu, hmp_fastcpu_mask) {
+		rq = cpu_rq(cpu);
+		avg = &cpu_rq(cpu)->avg;
+		if (avg->load_avg_ratio == 0)
+			cpumask_set_cpu(cpu, &idle_cpus);
+	}
+
+	/* Don't iterate for slower domain, if no idle cpu in faster domain */
+	if (cpumask_empty(&idle_cpus))
+		goto DOWN;
+
+	/* Iterate slower domain cpus for up migration */
+	for_each_cpu(cpu, hmp_slowcpu_mask) {
+		if (!cpu_online(cpu)) {
+			continue;
+		}
+
+		nr_count = hmp_max_tasks;
+		rq = cpu_rq(cpu);
+
+		raw_spin_lock_irqsave(&rq->lock, flags);
+
+		curr = rq->cfs.curr;
+		if (!curr || rq->active_balance) {
+			raw_spin_unlock_irqrestore(&rq->lock, flags);
+			continue;
+		}
+		if (!entity_is_task(curr)) {
+			struct cfs_rq *cfs_rq;
+			cfs_rq = group_cfs_rq(curr);
+			while (cfs_rq) {
+				curr = cfs_rq->curr;
+				cfs_rq = group_cfs_rq(curr);
+			}
+		}
+
+		/* The currently running task is not on the runqueue */
+		se = __pick_first_entity(cfs_rq_of(curr));
+		while (!cpumask_empty(&idle_cpus) && nr_count-- && se) {
+			struct task_struct *p = task_of(se);
+			/* Check for eligible task for up migration */
+			if (entity_is_task(se) &&
+			    se->avg.load_avg_ratio > hmp_up_threshold &&
+			    cpumask_intersects(hmp_fastcpu_mask, tsk_cpus_allowed(p))) {
+
+#ifdef CONFIG_SCHED_HMP_PRIO_FILTER
+				/* Filter by task priority */
+				if (p->prio >= hmp_up_prio) {
+					se = __pick_next_entity(se);
+					continue;
+				}
+#endif
+
+				/* Let the task load settle before doing another up migration */
+				/* hack - always use clock from first online CPU */
+				now = cpu_rq(cpumask_first(cpu_online_mask))->clock_task;
+				if (((now - se->avg.hmp_last_up_migration) >> 10)
+				    < hmp_next_up_threshold) {
+					se = __pick_next_entity(se);
+					continue;
+				}
+
+				/* Wakeup for eligible idle cpu in faster domain */
+				if (hmp_domain_idle_cpu(hmp_faster_domain(cpu), &target_cpu,
+							tsk_cpus_allowed(p))) {
+					cpu_rq(target_cpu)->wake_for_idle_pull = 1;
+					cpumask_clear_cpu(target_cpu, &idle_cpus); /* Update idle cpumask */
+					smp_send_reschedule(target_cpu);
+				}
+			}
+			se = __pick_next_entity(se);
+		}
+		raw_spin_unlock_irqrestore(&rq->lock, flags);
+	}
+
+DOWN:
+
+	for_each_cpu(cpu, hmp_fastcpu_mask) {
+		if (!cpu_online(cpu))
+			continue;
 		force = 0;
 		got_target = 0;
-		target = cpu_rq(cpu);
-		raw_spin_lock_irqsave(&target->lock, flags);
-		curr = target->cfs.curr;
-		if (!curr || target->active_balance) {
-			raw_spin_unlock_irqrestore(&target->lock, flags);
+		rq = cpu_rq(cpu);
+
+		raw_spin_lock_irqsave(&rq->lock, flags);
+
+		curr = rq->cfs.curr;
+		if (!curr || rq->active_balance) {
+			raw_spin_unlock_irqrestore(&rq->lock, flags);
 			continue;
 		}
 		if (!entity_is_task(curr)) {
@@ -7193,19 +7323,7 @@ static void hmp_force_up_migration(int this_cpu)
 			}
 		}
 		orig = curr;
-		curr = hmp_get_heaviest_task(curr, -1);
-		if (!curr) {
-			raw_spin_unlock_irqrestore(&target->lock, flags);
-			continue;
-		}
-		p = task_of(curr);
-		if (hmp_up_migration(cpu, &target_cpu, curr)) {
-			cpu_rq(target_cpu)->wake_for_idle_pull = 1;
-			raw_spin_unlock_irqrestore(&target->lock, flags);
-			spin_unlock(&hmp_force_migration);
-			smp_send_reschedule(target_cpu);
-			return;
-		}
+
 		if (!got_target) {
 			/*
 			 * For now we just check the currently running task.
@@ -7214,13 +7332,13 @@ static void hmp_force_up_migration(int this_cpu)
 			 */
 			curr = hmp_get_lightest_task(orig, 1);
 			p = task_of(curr);
-			target->push_cpu = hmp_offload_down(cpu, curr);
-			if (target->push_cpu < NR_CPUS) {
+			rq->push_cpu = hmp_offload_down(cpu, curr);
+			if (rq->push_cpu < NR_CPUS) {
 				get_task_struct(p);
-				target->migrate_task = p;
+				rq->migrate_task = p;
 				got_target = 1;
-				trace_sched_hmp_migrate(p, target->push_cpu, HMP_MIGRATE_OFFLOAD);
-				hmp_next_down_delay(&p->se, target->push_cpu);
+				trace_sched_hmp_migrate(p, rq->push_cpu, HMP_MIGRATE_OFFLOAD);
+				hmp_next_down_delay(&p->se, rq->push_cpu);
 			}
 		}
 		/*
@@ -7229,24 +7347,28 @@ static void hmp_force_up_migration(int this_cpu)
 		 * CPU stopper take care of it.
 		 */
 		if (got_target) {
-			if (!task_running(target, p)) {
+			if (!task_running(rq, p)) {
 				trace_sched_hmp_migrate_force_running(p, 0);
-				hmp_migrate_runnable_task(target);
+				hmp_migrate_runnable_task(rq);
 			} else {
-				target->active_balance = 1;
+				rq->active_balance = 1;
 				force = 1;
 			}
 		}
-		raw_spin_unlock_irqrestore(&target->lock, flags);
+		raw_spin_unlock_irqrestore(&rq->lock, flags);
 
 		if (force)
-			stop_one_cpu_nowait(cpu_of(target),
+			stop_one_cpu_nowait(cpu_of(rq),
 				hmp_active_task_migration_cpu_stop,
-				target, &target->active_balance_work);
+				rq, &rq->active_balance_work);
 	}
+
 	spin_unlock(&hmp_force_migration);
+
 }
+
 /*
  * hmp_idle_pull looks at little domain runqueues to see
  * if a task should be pulled.
-- 
1.7.9.5
---------------------------------------------------------------------------------------------------------------
-----Original Message-----
From: Sarbojit Ganguly [mailto:ganguly.s@samsung.com]
Sent: 04 September 2015 08:46
To: Alex Shi; Rahul Khandelwal (Rahul Khandelwal); Sarbojit Ganguly; Jon Medhurst (Tixy)
Cc: Hiren Bharatbhai Gajjar; khilman@linaro.org; Morten.Rasmussen@arm.com; vincent.guittot@linaro.org; Gaurav Jindal (Gaurav Jindal); linaro-kernel@lists.linaro.org; Sanjeev Yadav (Sanjeev Kumar Yadav); chris.redpath@arm.com; Patrick Bellasi; SHARAN ALLUR; VIKRAM MUPPARTHI; SUNEEL KUMAR SURIMANI
Subject: Re: Re: [PATCH 1/1] HMP: Do not send IPI if core already waked up
Rahul,
I will run hackbench with your patch and share the results. Let's see if we can get some measurable differences on the performance side.
Regards, Sarbojit
------- Original Message -------
Sender : Alex Shi
Date : Aug 28, 2015 12:25 (GMT+05:30)
Title : Re: [PATCH 1/1] HMP: Do not send IPI if core already waked up
Seems it is big saving...
On 08/25/2015 11:58 PM, Rahul Khandelwal (Rahul Khandelwal) wrote:
Dear Sarbojit,
Agreed with your point, thanks for raising it.
There was one issue with the total interrupt count in the previous chart: the total number of interrupts is actually equal to the number of interrupts sent to idle cores.
On average we are saving more than 25% of the IPIs sent to the big-domain CPUs. I have updated the chart below with the related data.