On 22/11/16 14:51, Vincent Guittot wrote:
On 22 November 2016 at 13:32, Juri Lelli <juri.lelli@arm.com> wrote:
On 22/11/16 11:54, Patrick Bellasi wrote:
On 22-Nov 12:01, Vincent Guittot wrote:
[...]
The example above is: start run 1 40ms after the start of run 0
I would say run1 starts 20ms after the start of run0, doesn't it?
and start next event 20ms after the start of run1
mmm... not following here.
so you will not be impacted by the real duration of run0 and run1 (because of freq scaling or micro-arch). Do this sequence 2 times, then start the next sequence/phase
To me this is the expected behavior:
+-----+   +----------+        +-----+   +----------+
|r0   |   |r1        |        |r0   |   |r1        |
|     |   |          |        |     |   |          |
+----------------------------------------------------------->
0    10   20   30   40   50   60   70   80        100     120
          ^                   ^
          |                   |
          +                   +
        Timer0              Timer1
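FWIW, this is easy to sanity-check with a tiny standalone program (toy model in ms, not rt-app code; the 10/20ms runs, the 20/40ms periods and the shared timer reference are just my reading of the figure):

#include <stdio.h>

int main(void)
{
	int run[]    = { 10, 20 };	/* r0 and r1 run durations (ms) */
	int period[] = { 20, 40 };	/* timer0 and timer1 periods (ms) */
	int now = 0, t_next = 0;	/* t_next anchored at t_first = 0 */
	int loop, e;

	for (loop = 0; loop < 2; loop++) {
		for (e = 0; e < 2; e++) {
			printf("r%d: %3dms -> %3dms\n", e, now, now + run[e]);
			now += run[e];
			t_next += period[e];	/* absolute reference */
			if (now < t_next)
				now = t_next;	/* timer sleeps until t_next */
			/* else: timer already expired, no sleep */
		}
	}
	return 0;	/* prints r0@0, r1@20, r0@60, r1@80; ends at 120ms */
}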
So, if we really need this level of flexibility, we can add up periods as well (as Vincent was already saying), so that we end up with a row in the log for each phase reporting a "super-period" of 60ms.
If we agree that this makes sense, what do we do for the slack? One possibility could be to start adding up there as well, so that the sub-slack of the first timer and the sub-slack of the second one are reflected in the slack reported in the per-phase row.
One thing that might be counter-intuitive looking at traces, though, is that if, for example, the first run executes for more than the first period (say 25ms, e.g. because frequency was low), it will report a -5ms slack, which adds up with the sub-slack of r1 (which instead recovered and effectively executed for 20ms, finishing 15ms before its timer fires at 60ms). So we now report a 10ms overall slack (-5ms + 15ms), while in the trace we see a 15ms gap before the next event.
+------------+----------+     +-----+   +----------+
|r0          |r1        |     |r0   |   |r1        |
|            |          |     |     |   |          |
+----------------------------------------------------------->
0    10   20   30   40   50   60   70   80        100     120
          ^                   ^
          |                   |
          +                   +
        Timer0              Timer1
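Spelling out the numbers with the same kind of toy model (standalone C in ms, not rt-app code; r0 bumped to 25ms):

#include <stdio.h>

int main(void)
{
	int run[]    = { 25, 20 };	/* r0 overshoots to 25ms */
	int period[] = { 20, 40 };
	int now = 0, t_next = 0, slack_sum = 0;
	int e, slack;

	for (e = 0; e < 2; e++) {
		now += run[e];
		t_next += period[e];
		slack = t_next - now;	/* -5ms for timer0, +15ms for timer1 */
		slack_sum += slack;
		printf("timer%d sub-slack: %dms\n", e, slack);
		if (now < t_next)
			now = t_next;
	}
	printf("reported slack: %dms\n", slack_sum);	/* 10ms */
	printf("gap in the trace: %dms\n", t_next - (run[0] + run[1]));	/* 15ms */
	return 0;
}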
Oh, BTW, I also think that the following change is actually needed, otherwise the first run overshooting will influence the super-period.
Agree?
--->8---
From 52492c9ba118b686b76fcda05389a3a9c7c73120 Mon Sep 17 00:00:00 2001
From: Juri Lelli <juri.lelli@arm.com>
Date: Tue, 22 Nov 2016 12:29:27 +0000
Subject: [PATCH] rt-app: don't let overshooting run phases affect timers
A run phase that overshoots might run past the next timer event. If that happens, we still want timers not to be affected by it.
Don't use current clock for setting the next timer event, always refer to t_first.
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
 src/rt-app.c | 1 -
 1 file changed, 1 deletion(-)
diff --git a/src/rt-app.c b/src/rt-app.c
index bec511121d77..36013758f718 100644
--- a/src/rt-app.c
+++ b/src/rt-app.c
@@ -338,7 +338,6 @@ static int run_event(event_data_t *event, int dry_run,
 			t_wu = timespec_sub(&t_now, &rdata->res.timer.t_next);
 			ldata->wu_latency = timespec_to_usec(&t_wu);
 		} else {
-			clock_gettime(CLOCK_MONOTONIC, &rdata->res.timer.t_next);
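To see what this changes in practice, the timer event then behaves roughly like the standalone sketch below (plain timespec arithmetic instead of rt-app's helpers, error handling elided):

#include <time.h>

static void timer_event(struct timespec *t_next, long period_us)
{
	struct timespec t_now;

	/* Advance the absolute expiry: t_next is only ever derived from
	 * its previous value, ultimately anchored at t_first. */
	t_next->tv_nsec += period_us * 1000L;
	while (t_next->tv_nsec >= 1000000000L) {
		t_next->tv_nsec -= 1000000000L;
		t_next->tv_sec++;
	}

	clock_gettime(CLOCK_MONOTONIC, &t_now);
	if (t_now.tv_sec < t_next->tv_sec ||
	    (t_now.tv_sec == t_next->tv_sec &&
	     t_now.tv_nsec < t_next->tv_nsec)) {
		/* timer still in the future: sleep until it fires */
		clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, t_next, NULL);
	}
	/* else: timer already expired. With the clock_gettime() above
	 * removed, t_next is NOT reset to "now", so the overshoot is
	 * carried over into the following periods. */
}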
The 1st implementation didn't update t_next when the timer had already expired. But then the following timers were also reported as expired, because the previous period had stolen time even though the run time was correct. We faced the issue while testing a frequency scaling policy: the governor was slow to react to the 1st period, and the overshoot during this 1st period was so huge compared to the margin of the use case that the next periods had already expired when their run finished, even though the governor had reacted and increased the frequency to max. In fact, the delay generated by the 1st period was so huge that it was never compensated afterwards; the result was meaningless, because 1 period was really wrong but the following ones were also seen as wrong.
As an example: run 9ms in a period of 10ms. Your platform starts at the lowest OPP, which is 10 times slower than the max OPP. If the 1st run event takes something like 50ms (the time for the governor to react), we have an overshoot of 40ms. This means that even if we are now at the max OPP and the run event really takes 9ms, the next 40 periods will be seen as wrong, because we will set the timer in the past.
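That's easy to check numerically; a quick standalone comparison of the two behaviours (toy model in ms, all names mine):

#include <stdio.h>

int main(void)
{
	int relative;

	/* 9ms of work per 10ms period; the 1st run takes 50ms because
	 * the governor is still at the lowest OPP. */
	for (relative = 0; relative <= 1; relative++) {
		int period = 10, now = 0, t_next = 0, expired = 0;
		int i;

		for (i = 0; i < 100; i++) {
			now += (i == 0) ? 50 : 9;
			t_next += period;
			if (now <= t_next) {
				now = t_next;	/* sleep until next expiry */
			} else {
				expired++;	/* timer was set in the past */
				if (relative)
					t_next = now;	/* re-anchor on current clock */
			}
		}
		printf("%s: %d periods seen as expired\n",
		       relative ? "relative (reset t_next)" :
				  "absolute (keep t_first)", expired);
	}
	return 0;	/* absolute: 40, relative: 1 */
}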
OK, right. I think we already discussed this point in the past (and didn't agree on a common solution :). Since we have now actually imposed and used the "cumulative" approach (where successive activations can get penalized, potentially indefinitely), it models a stricter behaviour by always keeping an absolute (t_first) reference no matter what happens to the run phases.
Since I think that both approaches might be useful to implement, why don't we add a "mode" : [absolute|relative] to timers, and then we check the mode when executing the event to decide whether we want to use the current clock for t_next or keep t_first as the reference?
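Something along these lines, perhaps (hypothetical names, none of this exists in rt-app yet):

#include <time.h>

/* Hypothetical per-timer mode, parsed from a "mode" key in the JSON. */
enum timer_mode { TIMER_ABSOLUTE, TIMER_RELATIVE };

struct timer_state {
	struct timespec t_next;		/* initialised from t_first */
	enum timer_mode mode;
};

/* Would run on the expired path of the timer event (the else branch
 * in the hunk quoted above). */
static void timer_expired(struct timer_state *t)
{
	if (t->mode == TIMER_RELATIVE)
		/* re-anchor on the current clock: only this one period
		 * is accounted as wrong */
		clock_gettime(CLOCK_MONOTONIC, &t->t_next);
	/* TIMER_ABSOLUTE: keep the t_first based t_next, so the
	 * overshoot keeps penalising the following activations */
}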