When a cpu enters a deep idle state, the local timers are stopped and
the time framework falls back to the timer device used as a broadcast
timer.
The different cpuidle drivers are calling clockevents_notify ENTER/EXIT
when the idle state stops the local timer.
Add a new flag CPUIDLE_FLAG_TIMER_STOP which can be set by the cpuidle
drivers. If the flag is set, the cpuidle core code takes care of the
notification on behalf of the driver to avoid pointless code duplication.
Signed-off-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Len Brown <lenb(a)kernel.org>
Cc: Linus Walleij <linus.walleij(a)linaro.org>
Cc: Santosh Shilimkar <santosh.shilimkar(a)ti.com>
Cc: Rajendra Nayak <rnayak(a)ti.com>
Cc: Sascha Hauer <kernel(a)pengutronix.de>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
---
drivers/cpuidle/cpuidle.c | 9 +++++++++
include/linux/cpuidle.h | 1 +
2 files changed, 10 insertions(+)
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index eba6929..c500370 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -8,6 +8,7 @@
* This code is licenced under the GPL.
*/
+#include <linux/clockchips.h>
#include <linux/kernel.h>
#include <linux/mutex.h>
#include <linux/sched.h>
@@ -146,12 +147,20 @@ int cpuidle_idle_call(void)
trace_cpu_idle_rcuidle(next_state, dev->cpu);
+ if (drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
+ clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
+ &dev->cpu);
+
if (cpuidle_state_is_coupled(dev, drv, next_state))
entered_state = cpuidle_enter_state_coupled(dev, drv,
next_state);
else
entered_state = cpuidle_enter_state(dev, drv, next_state);
+ if (drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
+ clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT,
+ &dev->cpu);
+
trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu);
/* give the governor an opportunity to reflect on the outcome */
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 480c14d..a837b33 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -57,6 +57,7 @@ struct cpuidle_state {
/* Idle State Flags */
#define CPUIDLE_FLAG_TIME_VALID (0x01) /* is residency time measurable? */
#define CPUIDLE_FLAG_COUPLED (0x02) /* state applies to multiple cpus */
+#define CPUIDLE_FLAG_TIMER_STOP (0x04) /* timer is stopped on this state */
#define CPUIDLE_DRIVER_FLAGS_MASK (0xFFFF0000)
--
1.7.9.5
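As a rough user-space illustration of the change above (not kernel code), the sketch below mocks the core-side logic: the notification pair is issued around the idle entry only when the selected state carries CPUIDLE_FLAG_TIMER_STOP. The flag values are copied from the patch; every `mock_` name, the counters, and the stand-in for cpuidle_enter_state() are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values copied from the patch; everything else is a user-space mock. */
#define CPUIDLE_FLAG_TIME_VALID 0x01
#define CPUIDLE_FLAG_COUPLED    0x02
#define CPUIDLE_FLAG_TIMER_STOP 0x04

enum mock_event { MOCK_BROADCAST_ENTER, MOCK_BROADCAST_EXIT };

struct mock_state {
	const char *name;
	unsigned int flags;
};

/* Counters standing in for the side effects of clockevents_notify(). */
static int mock_enter_count;
static int mock_exit_count;

static void mock_clockevents_notify(enum mock_event ev)
{
	if (ev == MOCK_BROADCAST_ENTER)
		mock_enter_count++;
	else
		mock_exit_count++;
}

/* Hypothetical stand-in for cpuidle_enter_state(): reports the index. */
static int mock_enter_state(const struct mock_state *states, int index)
{
	(void)states;
	return index;
}

/* Core-side wrapper: notify around the idle entry only if the flag is set. */
static int mock_idle_call(const struct mock_state *states, size_t nr, int next)
{
	int entered;
	int timer_stops;

	assert(next >= 0 && (size_t)next < nr);
	timer_stops = (states[next].flags & CPUIDLE_FLAG_TIMER_STOP) != 0;

	if (timer_stops)
		mock_clockevents_notify(MOCK_BROADCAST_ENTER);

	entered = mock_enter_state(states, next);

	if (timer_stops)
		mock_clockevents_notify(MOCK_BROADCAST_EXIT);

	return entered;
}
```

With this shape a driver only has to set the flag in its state table; a state without the flag triggers no notification at all.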
commit 91d1aa43 (context_tracking: New context tracking susbsystem)
generalized parts of the RCU userspace extended quiescent state into
the context tracking subsystem. Context tracking is then used
to implement adaptive tickless (a.k.a. full nohz).
Mainline currently only includes x86 support for the context tracking
subsystem and the goal of this series is to add ARM support, in order
to make adaptive tickless functional on ARM.
Depends on the prerequisites series:
[PATCH 0/3] ARM: context tracking support prerequisites
http://marc.info/?l=linux-kernel&m=136382248131438&w=2
Both of which are combined on top of Frederic's 3.9-rc1-nohz1 branch
and available here:
git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux.git arm-nohz-v2/context-tracking
Using this, I tested adaptive tickless on a 2 CPU ARM SoC (OMAP4
Panda.)
Kevin Hilman (4):
ARM: context tracking: add exception support
ARM: context tracking: instrument system calls
ARM: context tracking: handle post exception/syscall/IRQ work
ARM: Kconfig: allow context tracking
arch/arm/Kconfig | 1 +
arch/arm/include/asm/thread_info.h | 4 +++-
arch/arm/kernel/ptrace.c | 11 +++++++++++
arch/arm/kernel/signal.c | 12 +++++++++---
arch/arm/kernel/traps.c | 18 +++++++++++++++++-
arch/arm/mm/fault.c | 32 +++++++++++++++++++++++++++-----
6 files changed, 68 insertions(+), 10 deletions(-)
--
1.8.2
In multiprocessor systems with cpus with different compute capabilities it is
essential for performance that heavy tasks are scheduled on the most capable
cpus. The current scheduler does not handle such performance heterogeneous
systems optimally. This patch set proposes a small set of changes that
significantly improves performance on these systems.
Looking at the current scheduler design the most obvious way to represent the
compute capability of each individual cpu is to use cpu_power as this is
already used for load-balancing. The recently included entity load-tracking
adds the infrastructure to distinguish between heavy and light tasks.
The proposed changes move heavy tasks to cpus with higher cpu_power to get
better performance and fix load-balancing issues caused by the cpu_power
difference when having one heavy task per cpu.
The patches require load-balancing to be based on entity load-tracking and
therefore use Alex Shi's patch set as the starting point:
https://lkml.org/lkml/2013/1/25/767
The patches are based on 3.9-rc2 and have been tested on an ARM vexpress TC2
big.LITTLE testchip containing five cpus: 2xCortex-A15 + 3xCortex-A7.
Additional testing and refinements might be needed later as more sophisticated
platforms become available.
cpu_power A15: 1441
cpu_power A7: 606
Benchmarks:
cyclictest: cyclictest -a -t 2 -n -D 10
hackbench: hackbench (default settings)
sysbench_1t: sysbench --test=cpu --num-threads=1 --max-requests=1000 run
sysbench_2t: sysbench --test=cpu --num-threads=2 --max-requests=1000 run
sysbench_5t: sysbench --test=cpu --num-threads=5 --max-requests=1000 run
Mixed cpu_power:
Average times over 20 runs normalized to 3.9-rc2 (lower is better):
              3.9-rc2   +shi      +shi+patches   Improvement
cyclictest
      AVG     74.9      74.5      75.75          -1.13%
      MIN     69        69        69
      MAX     88        88        94
hackbench
      AVG     2.17      2.09      2.09           3.90%
      MIN     2.10      1.95      2.02
      MAX     2.25      2.48      2.17
sysbench_1t
      AVG     25.13*    16.47'    16.48          34.43%
      MIN     16.47     16.47     16.47
      MAX     33.78     16.48     16.54
sysbench_2t
      AVG     19.32     18.19     16.51          14.55%
      MIN     16.48     16.47     16.47
      MAX     22.15     22.19     16.61
sysbench_5t
      AVG     27.22     27.71     24.14          11.31%
      MIN     25.42     27.66     24.04
      MAX     27.75     27.86     24.31
* The unpatched 3.9-rc2 scheduler gives inconsistent performance as tasks may
randomly be placed on either A7 or A15 cores. The max/min values reflect this
behaviour. A15 and A7 performance are ~16.5 and ~33.5 respectively.
' While Alex Shi's patches appear to solve the performance inconsistency for
sysbench_1t, it is not the true picture for all workloads. This can be seen for
sysbench_2t.
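The Improvement column appears to be the relative reduction of the average time against unpatched 3.9-rc2, i.e. (baseline - patched) / baseline. This formula is an inference from the figures, not something the cover letter states; it reproduces the listed percentages to within the rounding of the displayed averages. A minimal sketch, with a hypothetical function name:

```c
#include <assert.h>

/*
 * Relative improvement in percent. Positive means the patched kernel
 * is faster (lower time). Assumed formula, inferred from the table.
 */
static double improvement_pct(double baseline, double patched)
{
	assert(baseline > 0.0);
	return (baseline - patched) / baseline * 100.0;
}
```

For example, the sysbench_5t averages (27.22 and 24.14) give roughly 11.3%, and the cyclictest averages (74.9 and 75.75) give a small negative value, matching the sign convention of the table.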
To ensure that the proposed changes do not affect normal SMP systems, the
same benchmarks have been run on a 2xCortex-A15 configuration as well:
SMP:
Average times over 20 runs normalized to 3.9-rc2 (lower is better):
              3.9-rc2   +shi      +shi+patches   Improvement
cyclictest
      AVG     78.6      75.3      77.6           1.34%
      MIN     69        69        69
      MAX     135       98        125
hackbench
      AVG     3.55      3.54      3.55           0.06%
      MIN     3.51      3.48      3.49
      MAX     3.66      3.65      3.67
sysbench_1t
      AVG     16.48     16.48     16.48          -0.03%
      MIN     16.47     16.48     16.48
      MAX     16.49     16.48     16.48
sysbench_2t
      AVG     16.53     16.53     16.54          -0.05%
      MIN     16.47     16.47     16.48
      MAX     16.59     16.57     16.59
sysbench_5t
      AVG     41.16     41.15     41.15          0.04%
      MIN     41.14     41.13     41.11
      MAX     41.35     41.19     41.17
Note:
The cpu_power setup code is already present in 3.9-rc2, but the device
tree provided for ARM vexpress TC2 is missing frequency information. Adding
this will give the cpu_powers listed above.
Morten
Morten Rasmussen (1):
sched: Pull tasks from cpus with multiple tasks when idle
Vincent Guittot (1):
sched: Force migration on a better cpu
kernel/sched/fair.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 53 insertions(+), 4 deletions(-)
--
1.7.9.5
Guenter,
Sorry for the delay due to our internal issues. Please check these v5 patches,
and thanks for all your reviews and comments on this patch set.
Anton Vorontsov,
I have added your Acked-by: to patch [1/5] (which was [1/3]) and split the
old [2/3] into the new [2/5], [3/5] and [4/5] patches, so please have another
look at them. Thank you.
v4 -> v5 changes:
- split the old [2/3]-ab8500-re-arrange-ab8500-power-and-temperature-data into
three new patches: [2/5], [3/5] and [4/5].
- hwmon driver minor coding style clean ups:
- {} usage in if-else statement in ab8500_read_sensor function
- index error fix in gpadc_monitor function
- fix issue of clamp_val() usage
- remove unnecessary else in function abx500_attrs_visible
- remove redundant print message about irq set up
- return the calling function return value directly in probe function
v3 -> v4 changes:
for patch [3/3]
- define delays in HZ
- update ab8500_read_sensor function, returning temp by parameter
- remove ab8500_is_visible function
- use clamp_val in set_min and set_max callback
- remove unnecessary locks in remove and suspend functions
- let abx500 and ab8500 use its own data structure
for patch [2/3]
- move the data tables from driver/power/ab8500_bmdata.c to
include/linux/power/ab8500.h
- rename driver/power/ab8500_bmdata.c to driver/power/ab8500_bm.c
- rename these variable names to eliminate CamelCase warnings
- add const attribute to these data
v2 -> v3 changes:
- Add interface for converting voltage to temperature
- Remove temp5 sensor since we cannot offer temperature read interface of it
- Update hyst to use absolute temperature instead of a difference
- Add the 3/3 patch
v1 -> v2 changes:
- Add Documentation/hwmon/ab8500 and Documentation/hwmon/abx500
- Make devices which cannot report milli-Celsius invisible
- Add temp5_crit interface
- Re-work the old find_active_thresholds() to threshold_updated()
- Reset updated_min_alarm and updated_max_alarm at the end of each loop
- Update the hyst mechanism to make it work as a real hyst
- Remove non-standard attributes
- Re-order the operations sequence inside probe and remove functions
- Update all the lock usages to eliminate race conditions
- Make attribute indexes start from 0
also changes:
- The old [1/2] "ARM: ux500: rename ab8500 to abx500 for hwmon driver"
has been merged by Samuel, so it is not sent again.
- Add another new patch "ab8500_btemp: export two symblols" as [2/2] of this
patch set.
Hongbo Zhang (5):
ab8500_btemp: make ab8500_btemp_get* interfaces public
ab8500: power: eliminate CamelCase warning of some variables
ab8500: power: add const attributes to some data arrays
ab8500: power: export abx500_res_to_temp tables for hwmon
hwmon: add ST-Ericsson ABX500 hwmon driver
Documentation/hwmon/ab8500 | 22 ++
Documentation/hwmon/abx500 | 28 ++
drivers/hwmon/Kconfig | 13 +
drivers/hwmon/Makefile | 1 +
drivers/hwmon/ab8500.c | 206 +++++++++++++++
drivers/hwmon/abx500.c | 491 +++++++++++++++++++++++++++++++++++
drivers/hwmon/abx500.h | 69 +++++
drivers/power/ab8500_bmdata.c | 38 +--
drivers/power/ab8500_btemp.c | 5 +-
drivers/power/ab8500_fg.c | 4 +-
include/linux/mfd/abx500.h | 6 +-
include/linux/mfd/abx500/ab8500-bm.h | 5 +
include/linux/power/ab8500.h | 16 ++
13 files changed, 881 insertions(+), 23 deletions(-)
create mode 100644 Documentation/hwmon/ab8500
create mode 100644 Documentation/hwmon/abx500
create mode 100644 drivers/hwmon/ab8500.c
create mode 100644 drivers/hwmon/abx500.c
create mode 100644 drivers/hwmon/abx500.h
create mode 100644 include/linux/power/ab8500.h
--
1.8.0
This patch series adds support for DRM FIMD DT for Exynos4 DT Machines,
specifically for Exynos4412 SoC.
changes since v8:
- addressed comments to add missing documentation for clock and clock-names
properties as pointed out by Sachin Kamat <sachin.kamat(a)linaro.org>
changes since v7:
- rebased to kgene's "for-next"
- Migrated to Common Clock Framework
- removed the patch "ARM: dts: Add FIMD AUXDATA node entry for exynos4 DT",
as migration to Common Clock Framework will NOT need this.
- addressed the comments raised by Sachin Kamat <sachin.kamat(a)linaro.org>
changes since v6:
- addressed comments and added interrupt-names = "fifo", "vsync", "lcd_sys"
in exynos4.dtsi and re-ordered the interrupt numbering to match the order in
interrupt combiner IP as suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>.
changes since v5:
- renamed the fimd binding documentation file name as "samsung-fimd.txt",
since it not only talks about exynos display controller but also about
previous samsung display controllers.
- rephrased an ambiguous statement about the interrupt combiner in the
fimd binding documentation as pointed out by
Sachin Kamat <sachin.kamat(a)linaro.org>
changes since v4:
- moved the fimd binding documentation to Documentation/devicetree/bindings/video/
as suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>
- added more fimd compatibility strings in fimd documentation as
discussed at https://patchwork.kernel.org/patch/2144861/ with
Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com> and
Tomasz Figa <tomasz.figa(a)gmail.com>
- modified compatible string for exynos4 fimd as "exynos4210-fimd"
exynos5 fimd as "exynos5250-fimd" to stick to the rule that compatible
value should be named after first specific SoC model in which this
particular IP version was included as discussed at
https://patchwork.kernel.org/patch/2144861/
- documented more about the interrupt combiner and their order as
suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>
changes since v3:
- rebased on
http://git.kernel.org/?p=linux/kernel/git/kgene/linux-samsung.git;a=shortlo…
changes since v2:
- added alias to 'fimd@11c00000' node
(reported by: Rahul Sharma <r.sh.open(a)gmail.com>)
- removed 'lcd0_data' node as there was already a similar node lcd_data24
(reported by: Jingoo Han <jg1.han(a)samsung.com>)
- replaced spaces with tabs in display-timing node
changes since v1:
- added new patch to add FIMD DT binding Documentation
- removed patch enabling SAMSUNG_DEV_BACKLIGHT and SAMSUNG_DEV_PMW
for mach-exynos4 DT
- added 'status' property to fimd DT node
This series is based on kgene's "for-next" branch:
https://git.kernel.org/cgit/linux/kernel/git/kgene/linux-samsung.git/log/?h…
Sachin Kamat (1):
ARM: dts: Add lcd pinctrl node entries for EXYNOS4412 SoC
Vikas Sajjan (3):
ARM: dts: Add FIMD node to exynos4
ARM: dts: Add FIMD node and display timing node to
exynos4412-origen.dts
ARM: dts: Add FIMD DT binding Documentation
.../devicetree/bindings/video/samsung-fimd.txt | 68 ++++++++++++++++++++
arch/arm/boot/dts/exynos4.dtsi | 11 ++++
arch/arm/boot/dts/exynos4412-origen.dts | 22 +++++++
arch/arm/boot/dts/exynos4x12-pinctrl.dtsi | 14 ++++
4 files changed, 115 insertions(+)
create mode 100644 Documentation/devicetree/bindings/video/samsung-fimd.txt
--
1.7.9.5
=== David Long ===
=== Highlights ===
* I've set the uprobe xol/VM issue aside for the moment and have been
working on trying to separate the instruction simulation code from the
k/u-probe code so it may be more cleanly shared.
=== Plans ===
* Continue with the code restructuring, return to the xol issue.
=== Issues ===
-dl
With the addition of the following patch:
fcf8058 cpufreq: Simplify cpufreq_add_dev()
cpufreq driver's .init() routine must initialize policy->cpus with mask of all
possible cpus (Online + Offline) that share the clock. Then the core would copy
this mask onto policy->related_cpus and will reset policy->cpus to carry only
online cpus.
acpi-cpufreq driver wasn't updated with this assumption and so sometimes when
we try to hot[un]plug cpus at run time, sysfs directories get corrupted.
This patch fixes acpi-cpufreq driver against this corruption.
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
Tested-by: Maciej Rutecki <maciej.rutecki(a)gmail.com>
---
drivers/cpufreq/acpi-cpufreq.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index afbef9c..11b8b4b 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -723,7 +723,6 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
policy->shared_type == CPUFREQ_SHARED_TYPE_ANY) {
cpumask_copy(policy->cpus, perf->shared_cpu_map);
}
- cpumask_copy(policy->related_cpus, perf->shared_cpu_map);
#ifdef CONFIG_SMP
dmi_check_system(sw_any_bug_dmi_table);
@@ -735,7 +734,6 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
if (check_amd_hwpstate_cpu(cpu) && !acpi_pstate_strict) {
cpumask_clear(policy->cpus);
cpumask_set_cpu(cpu, policy->cpus);
- cpumask_copy(policy->related_cpus, cpu_sibling_mask(cpu));
policy->shared_type = CPUFREQ_SHARED_TYPE_HW;
pr_info_once(PFX "overriding BIOS provided _PSD data\n");
}
--
1.7.12.rc2.18.g61b472e
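The mask handling described in the commit message above can be sketched in user space as follows. This is only an illustration under stated assumptions: a 32-bit integer stands in for the kernel's cpumask, and all `mock_` names are hypothetical. The driver's init routine writes only the cpus mask; the core then derives related_cpus from it and trims cpus to the online set.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per cpu; a stand-in for the kernel's struct cpumask. */
typedef uint32_t mock_cpumask;

struct mock_policy {
	mock_cpumask cpus;         /* set by the driver's .init()  */
	mock_cpumask related_cpus; /* filled in by the core only   */
};

/* Hypothetical driver .init(): report every cpu sharing the clock. */
static void mock_driver_init(struct mock_policy *policy, mock_cpumask shared)
{
	assert(shared != 0);
	policy->cpus = shared;
	/* No write to related_cpus here: that is what the patch removes. */
}

/* Core side, following the behaviour described in the commit message. */
static void mock_core_add_dev(struct mock_policy *policy,
			      mock_cpumask shared, mock_cpumask online)
{
	policy->related_cpus = 0;
	mock_driver_init(policy, shared);
	policy->related_cpus = policy->cpus; /* online + offline cpus */
	policy->cpus &= online;              /* online cpus only      */
}
```

If the driver also wrote related_cpus itself, the two masks could disagree after the core's copy, which is the kind of inconsistency the patch avoids.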
On Sat 23 Mar 2013 06:52:36 AM CST,
linaro-kernel-request(a)lists.linaro.org wrote:
> Today's Topics:
>
> 1. Re: [PATCH V3 5/7] mmc: queue work on any cpu (Viresh Kumar)
> 2. Re: [PATCH V3 5/7] mmc: queue work on any cpu (Chris Ball)
> 3. Re: [PATCH V3 5/7] mmc: queue work on any cpu (Chris Ball)
> 4. Re: [PATCH V3 5/7] mmc: queue work on any cpu (Chris Ball)
> 5. Re: [PATCH 2/2] PM / devfreq: tie suspend/resume to
> runtime-pm (Kevin Hilman)
> 6. [ACTIVITY] (Linus Walleij) 2013-02-22 - 2013-03-22 (Linus Walleij)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 22 Mar 2013 22:57:49 +0530
> From: Viresh Kumar <viresh.kumar(a)linaro.org>
> To: Chris Ball <cjb(a)laptop.org>
> Cc: venki(a)google.com, linaro-kernel(a)lists.linaro.org,
> suresh.b.siddha(a)intel.com, peterz(a)infradead.org, Liviu.Dudau(a)arm.com,
> robin.randhawa(a)arm.com, linux-kernel(a)vger.kernel.org,
> rostedt(a)goodmis.org, mingo(a)redhat.com, Steve.Bannister(a)arm.com,
> tj(a)kernel.org, linux-mmc(a)vger.kernel.org, Arvind.Chauhan(a)arm.com,
> pjt(a)google.com, linux-rt-users(a)vger.kernel.org,
> charles.garcia-tobin(a)arm.com
> Subject: Re: [PATCH V3 5/7] mmc: queue work on any cpu
> Message-ID:
> <CAKohpomqkYeMgvTpjrUjGs3md=vzguW5Ae6Snf4hzsqtQxNDOw(a)mail.gmail.com>
> Content-Type: text/plain; charset=windows-1252
>
> On 22 March 2013 22:56, Chris Ball <cjb(a)laptop.org> wrote:
>> On Mon, Mar 18 2013, Viresh Kumar wrote:
>>
>> /home/cjb/git/mmc/drivers/mmc/core/core.c: In function 'mmc_schedule_delayed_work':
>> /home/cjb/git/mmc/drivers/mmc/core/core.c:88:2: error: implicit declaration of function 'queue_delayed_work_on_any_cpu' [-Werror=implicit-function-declaration]
>>
>> I've dropped this patch for now. This function doesn't seem to be
>> defined in linux-next either.
>
> Hi Chris,
>
> This patch was part of a bigger patchset which also adds this API. I
> don't want you to apply this one but just Ack here. Probably Tejun or
> some scheduler maintainer will apply it later (if they like all patches).
>
>
>
> ------------------------------
>
> Message: 2
> Date: Fri, 22 Mar 2013 13:09:13 -0400
> From: Chris Ball <cjb(a)laptop.org>
> To: Viresh Kumar <viresh.kumar(a)linaro.org>
> Cc: venki(a)google.com, linaro-kernel(a)lists.linaro.org,
> suresh.b.siddha(a)intel.com, peterz(a)infradead.org, Liviu.Dudau(a)arm.com,
> robin.randhawa(a)arm.com, linux-kernel(a)vger.kernel.org,
> rostedt(a)goodmis.org, mingo(a)redhat.com, Steve.Bannister(a)arm.com,
> tj(a)kernel.org, linux-mmc(a)vger.kernel.org, Arvind.Chauhan(a)arm.com,
> pjt(a)google.com, linux-rt-users(a)vger.kernel.org,
> charles.garcia-tobin(a)arm.com
> Subject: Re: [PATCH V3 5/7] mmc: queue work on any cpu
> Message-ID: <87sj3nl5ty.fsf(a)octavius.laptop.org>
> Content-Type: text/plain
>
> Hi,
>
> On Mon, Mar 18 2013, Viresh Kumar wrote:
>> mmc uses workqueues for running mmc_rescan(). There is no real dependency of
>> scheduling these on the cpu which scheduled them.
>>
>> On an idle system, it is observed that an idle cpu wakes up many times just to
>> service this work. It would be better if we can schedule it on a cpu which isn't
>> idle to save on power.
>>
>> By idle cpu (from scheduler's perspective) we mean:
>> - Current task is idle task
>> - nr_running == 0
>> - wake_list is empty
>>
>> This patch replaces the queue_delayed_work() with
>> queue_delayed_work_on_any_cpu() siblings.
>>
>> This routine would look for the closest (via scheduling domains) non-idle cpu
>> (non-idle from scheduler's perspective). If the current cpu is not idle or all
>> cpus are idle, work will be scheduled on local cpu.
>>
>> Cc: Chris Ball <cjb(a)laptop.org>
>> Cc: linux-mmc(a)vger.kernel.org
>> Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
>> ---
>> drivers/mmc/core/core.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>> index 9290bb5..adf331a 100644
>> --- a/drivers/mmc/core/core.c
>> +++ b/drivers/mmc/core/core.c
>> @@ -85,7 +85,7 @@ MODULE_PARM_DESC(
>> static int mmc_schedule_delayed_work(struct delayed_work *work,
>> unsigned long delay)
>> {
>> - return queue_delayed_work(workqueue, work, delay);
>> + return queue_delayed_work_on_any_cpu(workqueue, work, delay);
>> }
>>
>> /*
>
> Thanks, pushed to mmc-next for 3.10.
>
> - Chris.
Where can I find the kernel source for the PandaBoard ES?
=== Highlights ===
* Tixy reported issue with the linaro.android-3.9-experimental branch,
and I patched it up.
* Sent Tixy's issue along with other fixups I had for the
linaro.android-3.9 branch to AOSP via gerrit. Arve reviewed and picked
them up, so AOSP now matches our tree.
* Talked with Jakub about Jira card process
* Prepped and did interview #1 for the week, wrote up my summary afterwards
* Did a few iterations of review with Serban on his binder patches.
* Provided MikeL w/ Timezone coverage info (Linaro covers 71% of
timezones, missing only 7!)
* Reviewed blueprints and did weekly Android upstreaming subteam email.
* Prepped, interviewed and summarized interview #2 this week.
* Started reviewing and testing Minchan's latest vrange patches. Sent
some questions and a small cleanup fix his way.
* Took a second pass at NTP changes required for timekeeping lock hold
reductions & sent to Thomas.
=== Plans ===
* Still need to sort out Linaro Connect expense reporting.
* Probably more interviews, although hopefully not too much more, as
they're taking up a big chunk of my time.
* Send out timekeeping lock hold time reductions.
* Send pull request to tglx for my 3.10 queue.
* Still need to work on earlysuspend blog post.
* Focus on volatile range work in prep for lsf-mm
=== Issues ===
* NA
== Linus Walleij linusw ==
=== Highlights ===
* First activity report since Linaro Connect!
* Sent first batch of multiplatform patches after finally getting
a few cycles to actually fix this. After that Arnd saw that there
was not much left and completed the series. (Yay!)
We now have ux500 booting in multiplatform config, but the
exact format of the patches needs to be discussed and
finalized. Some guys working on the PRCMU, CPUidle
and suspend drivers need to be coordinated.
* Collected pinctrl fixes and new patches.
* Started to churn through some of the GPIO review backlog.
* Initiated a discussion on PCI and VGA consoles, which really
took off. This is one of those open items, ARM PCI needs some
work for sure.
=== Plans ===
* A short paternity leave 6/5->9/5 in May.
* Finalize the multiplatform series, coordinate with stakeholders
and send a pull request.
* Convert Nomadik pinctrl driver to register GPIO ranges
from the gpiochip side.
* Test the PL08x patches on the Ericsson Research
PB11MPCore and submit platform data for using
pl08x DMA on that platform.
* Look into other Ux500 stuff in need of mainlining...
using an internal tracking sheet for this.
* Get hands dirty with regmap.
=== Issues ===
* Things have been hectic internally at ST-Ericsson diverting me
from Linaro work.
Thanks,
Linus Walleij
== Ulf Hansson ==
=== Highlights ===
Storage:
* Reviewing patches on mmc-list. Pushed a patch related to polling card detect.
* Discussing sent patchset to enable runtime pm support for mmc/sd block device.
* Rework parts of the HS200 and SDR104 support in the mmc protocol
layer. First part for tuning sequence done, patch will be pushed to
mmc-list shortly.
Clk:
* Preparing patchset for upstreaming patches that will add support for
abx500 clocks.
* Preparing patchset to update different driver's clk support used by ux500.
* Resent patches for clk_set_parent fixup.
* Patches to disable unprepared clocks at late init were merged by Mike.
* Diving into discussion around doing DVFS through the clock API.
Reviewing related patches.
=== Plans ===
Storage:
* Continue the work for the next item in the mmc power management blueprint.
* Push patches for mmci host driver, to support UHS cards/HS200, CMD23.
* Push patches for mmci host driver to extend the power management
support. Context save/restore are for example missing.
* Push patches for mmci host driver to add support for new STE 8540 variant.
Clk:
* Optimizations and bug fixes for ux500 clk implementations.
=== Issues ===
* None.
In the of_dma_controller_register() routine we pass the result of
of_get_property() directly to be32_to_cpup(). If the property doesn't exist,
of_get_property() returns NULL and we crash.
This patch changes this code to check if we got a valid property first and then
runs be32_to_cpup() on it.
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
My mail is broken, so I have pushed this patch here:
http://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/h…
drivers/dma/of-dma.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/dma/of-dma.c b/drivers/dma/of-dma.c
index 69d04d2..09c7ad1 100644
--- a/drivers/dma/of-dma.c
+++ b/drivers/dma/of-dma.c
@@ -93,6 +93,7 @@ int of_dma_controller_register(struct device_node *np,
{
struct of_dma *ofdma;
int nbcells;
+ const __be32 *prop;
if (!np || !of_dma_xlate) {
pr_err("%s: not enough information provided\n", __func__);
@@ -103,8 +104,11 @@ int of_dma_controller_register(struct device_node *np,
if (!ofdma)
return -ENOMEM;
- nbcells = be32_to_cpup(of_get_property(np, "#dma-cells", NULL));
- if (!nbcells) {
+ prop = of_get_property(np, "#dma-cells", NULL);
+ if (prop)
+ nbcells = be32_to_cpup(prop);
+
+ if (!prop || !nbcells) {
pr_err("%s: #dma-cells property is missing or invalid\n",
__func__);
kfree(ofdma);
--
1.7.12.rc2.18.g61b472e
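The check-before-dereference pattern from the patch above can be tried out in user space with the sketch below. It is a mock under stated assumptions: `mock_be32_to_cpup()` decodes a big-endian cell byte by byte instead of using the kernel helper, and `mock_read_dma_cells()` is a hypothetical function that takes the property pointer directly rather than calling of_get_property().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a big-endian 32-bit cell byte by byte (host-endian independent). */
static uint32_t mock_be32_to_cpup(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Returns 0 and stores the cell count on success, -1 when the property
 * is missing (NULL) or zero, mirroring the patched check.
 */
static int mock_read_dma_cells(const uint8_t *prop, uint32_t *nbcells)
{
	uint32_t cells = 0;

	assert(nbcells != NULL);

	/* Only dereference the property when it actually exists. */
	if (prop)
		cells = mock_be32_to_cpup(prop);

	if (!prop || !cells)
		return -1;

	*nbcells = cells;
	return 0;
}
```

A missing property and a zero-valued property both end up on the same error path, which is why the patch can keep a single error message for the two cases.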
When a cpu enters a deep idle state, the local timers are stopped and
the time framework falls back to the timer device used as a broadcast
timer.
The different cpuidle drivers are calling clockevents_notify ENTER/EXIT
when the idle state stops the local timer.
The proposed patch introduces a new flag CPUIDLE_FLAG_TIMER_STOP to let
the cpuidle framework call clockevents_notify itself, instead of
duplicating these lines again and again in all the cpuidle drivers.
Signed-off-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
---
drivers/cpuidle/cpuidle.c | 9 +++++++++
include/linux/cpuidle.h | 1 +
2 files changed, 10 insertions(+)
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index eba6929..c500370 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -8,6 +8,7 @@
* This code is licenced under the GPL.
*/
+#include <linux/clockchips.h>
#include <linux/kernel.h>
#include <linux/mutex.h>
#include <linux/sched.h>
@@ -146,12 +147,20 @@ int cpuidle_idle_call(void)
trace_cpu_idle_rcuidle(next_state, dev->cpu);
+ if (drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
+ clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
+ &dev->cpu);
+
if (cpuidle_state_is_coupled(dev, drv, next_state))
entered_state = cpuidle_enter_state_coupled(dev, drv,
next_state);
else
entered_state = cpuidle_enter_state(dev, drv, next_state);
+ if (drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
+ clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT,
+ &dev->cpu);
+
trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu);
/* give the governor an opportunity to reflect on the outcome */
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 480c14d..a837b33 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -57,6 +57,7 @@ struct cpuidle_state {
/* Idle State Flags */
#define CPUIDLE_FLAG_TIME_VALID (0x01) /* is residency time measurable? */
#define CPUIDLE_FLAG_COUPLED (0x02) /* state applies to multiple cpus */
+#define CPUIDLE_FLAG_TIMER_STOP (0x04) /* timer is stopped on this state */
#define CPUIDLE_DRIVER_FLAGS_MASK (0xFFFF0000)
--
1.7.9.5
This patch series adds support for DRM FIMD DT for Exynos4 DT Machines,
specifically for Exynos4412 SoC.
changes since v7:
- rebased to kgene's "for-next"
- Migrated to Common Clock Framework
- removed the patch "ARM: dts: Add FIMD AUXDATA node entry for exynos4 DT",
as migration to Common Clock Framework will NOT need this.
- addressed the comments raised by Sachin Kamat <sachin.kamat(a)linaro.org>
changes since v6:
- addressed comments and added interrupt-names = "fifo", "vsync", "lcd_sys"
in exynos4.dtsi and re-ordered the interrupt numbering to match the order in
interrupt combiner IP as suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>.
changes since v5:
- renamed the fimd binding documentation file name as "samsung-fimd.txt",
since it not only talks about exynos display controller but also about
previous samsung display controllers.
- rephrased an ambiguous statement about the interrupt combiner in the
fimd binding documentation as pointed out by
Sachin Kamat <sachin.kamat(a)linaro.org>
changes since v4:
- moved the fimd binding documentation to Documentation/devicetree/bindings/video/
as suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>
- added more fimd compatibility strings in fimd documentation as
discussed at https://patchwork.kernel.org/patch/2144861/ with
Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com> and
Tomasz Figa <tomasz.figa(a)gmail.com>
- modified compatible string for exynos4 fimd as "exynos4210-fimd"
exynos5 fimd as "exynos5250-fimd" to stick to the rule that compatible
value should be named after first specific SoC model in which this
particular IP version was included as discussed at
https://patchwork.kernel.org/patch/2144861/
- documented more about the interrupt combiner and their order as
suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>
changes since v3:
- rebased on
http://git.kernel.org/?p=linux/kernel/git/kgene/linux-samsung.git;a=shortlo…
changes since v2:
- added alias to 'fimd@11c00000' node
(reported by: Rahul Sharma <r.sh.open(a)gmail.com>)
- removed 'lcd0_data' node as there was already a similar node lcd_data24
(reported by: Jingoo Han <jg1.han(a)samsung.com>)
- replaced spaces with tabs in display-timing node
changes since v1:
- added new patch to add FIMD DT binding Documentation
- removed patch enabling SAMSUNG_DEV_BACKLIGHT and SAMSUNG_DEV_PMW
for mach-exynos4 DT
- added 'status' property to fimd DT node
This series is based on kgene's "for-next" branch:
https://git.kernel.org/cgit/linux/kernel/git/kgene/linux-samsung.git/log/?h…
Sachin Kamat (1):
ARM: dts: Add lcd pinctrl node entries for EXYNOS4412 SoC
Vikas Sajjan (3):
ARM: dts: Add FIMD node to exynos4
ARM: dts: Add FIMD node and display timing node to
exynos4412-origen.dts
ARM: dts: Add FIMD DT binding Documentation
.../devicetree/bindings/video/samsung-fimd.txt | 61 ++++++++++++++++++++
arch/arm/boot/dts/exynos4.dtsi | 11 ++++
arch/arm/boot/dts/exynos4412-origen.dts | 22 +++++++
arch/arm/boot/dts/exynos4x12-pinctrl.dtsi | 14 +++++
4 files changed, 108 insertions(+)
create mode 100644 Documentation/devicetree/bindings/video/samsung-fimd.txt
--
1.7.9.5
The FIMD driver expects the "vsync" interrupt to be listed first in the
FIMD DT node. To meet this expectation, the node had to be written with
"vsync" as its first interrupt.
This patch resolves that "hack" by introducing "interrupt-names", so that
the FIMD driver can get the interrupt resource by name, as discussed at
http://www.mail-archive.com/linux-samsung-soc@vger.kernel.org/msg16211.html
This patch depends on https://patchwork.kernel.org/patch/2184981/
Signed-off-by: Vikas Sajjan <vikas.sajjan(a)linaro.org>
---
arch/arm/boot/dts/exynos5250.dtsi | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/exynos5250.dtsi b/arch/arm/boot/dts/exynos5250.dtsi
index 0ee4706..76c8911 100644
--- a/arch/arm/boot/dts/exynos5250.dtsi
+++ b/arch/arm/boot/dts/exynos5250.dtsi
@@ -588,6 +588,7 @@
compatible = "samsung,exynos5-fimd";
interrupt-parent = <&combiner>;
reg = <0x14400000 0x40000>;
- interrupts = <18 5>, <18 4>, <18 6>;
+ interrupt-names = "fifo", "vsync", "lcd_sys";
+ interrupts = <18 4>, <18 5>, <18 6>;
};
};
--
1.7.9.5
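To illustrate why a name-based lookup removes the ordering dependency
described above, here is a minimal userspace sketch. It is only a model:
struct irq_entry, fimd_irqs and irq_by_name() are hypothetical names made
up for this example and are not part of the driver or the patch. In the
kernel, a driver would more likely rely on a helper such as
platform_get_irq_byname() rather than walk a table itself.

```c
#include <stddef.h>
#include <string.h>

/* Userspace model of one interrupt of a device node (hypothetical type). */
struct irq_entry {
	const char *name;	/* entry from "interrupt-names" */
	int combiner_group;	/* first cell of the "interrupts" specifier */
	int combiner_bit;	/* second cell of the "interrupts" specifier */
};

/* Mirrors the node after the patch above: fifo, vsync, lcd_sys. */
static const struct irq_entry fimd_irqs[] = {
	{ "fifo",    18, 4 },
	{ "vsync",   18, 5 },
	{ "lcd_sys", 18, 6 },
};

/* Return the entry whose name matches, or NULL if there is none. */
const struct irq_entry *irq_by_name(const struct irq_entry *tbl, size_t n,
				    const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(tbl[i].name, name) == 0)
			return &tbl[i];
	return NULL;
}
```

With this kind of lookup, irq_by_name(fimd_irqs, 3, "vsync") finds the
<18 5> entry regardless of where it sits in the list, which is the
property the patch is after.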
When the CPU_IDLE and the ARCH_KIRKWOOD options are set, defining a
separate CPU_IDLE_KIRKWOOD option is pointless because it is
redundant.
The Makefile in the drivers directory contains a condition to compile
the cpuidle drivers:
obj-$(CONFIG_CPU_IDLE) += cpuidle/
Hence, if CPU_IDLE is not set we won't enter this directory.
This patch removes the useless Kconfig option and replaces the
condition in the cpuidle Makefile with CONFIG_ARCH_KIRKWOOD.
Signed-off-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
---
arch/arm/configs/kirkwood_defconfig | 1 -
drivers/cpuidle/Kconfig | 6 ------
drivers/cpuidle/Makefile | 2 +-
3 files changed, 1 insertion(+), 8 deletions(-)
diff --git a/arch/arm/configs/kirkwood_defconfig b/arch/arm/configs/kirkwood_defconfig
index 13482ea..93f3794 100644
--- a/arch/arm/configs/kirkwood_defconfig
+++ b/arch/arm/configs/kirkwood_defconfig
@@ -56,7 +56,6 @@ CONFIG_AEABI=y
CONFIG_ZBOOT_ROM_TEXT=0x0
CONFIG_ZBOOT_ROM_BSS=0x0
CONFIG_CPU_IDLE=y
-CONFIG_CPU_IDLE_KIRKWOOD=y
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_UNIX=y
diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
index 071e2c3..c4cc27e 100644
--- a/drivers/cpuidle/Kconfig
+++ b/drivers/cpuidle/Kconfig
@@ -39,10 +39,4 @@ config CPU_IDLE_CALXEDA
help
Select this to enable cpuidle on Calxeda processors.
-config CPU_IDLE_KIRKWOOD
- bool "CPU Idle Driver for Kirkwood processors"
- depends on ARCH_KIRKWOOD
- help
- Select this to enable cpuidle on Kirkwood processors.
-
endif
diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
index 24c6e7d..0d8bd55 100644
--- a/drivers/cpuidle/Makefile
+++ b/drivers/cpuidle/Makefile
@@ -6,4 +6,4 @@ obj-y += cpuidle.o driver.o governor.o sysfs.o governors/
obj-$(CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED) += coupled.o
obj-$(CONFIG_CPU_IDLE_CALXEDA) += cpuidle-calxeda.o
-obj-$(CONFIG_CPU_IDLE_KIRKWOOD) += cpuidle-kirkwood.o
+obj-$(CONFIG_ARCH_KIRKWOOD) += cpuidle-kirkwood.o
--
1.7.9.5
The bash thread can wake up on a different CPU depending on which CPU
has initiated the wake up (CPU7 or CPU10 on your screenshot). What
is a bit less normal is why your expr tasks migrate in the middle of
their execution.
On 18 March 2013 20:07, Bruce Dawson <bruced(a)valvesoftware.com> wrote:
> BTW, I just uploaded a screenshot of the shell script running. You can see it here:
>
> http://www.cygnus-software.com/images/ZoomScreenshot_croppedbig.png
>
> It was made using http://www.rotateright.com/zoom/. The red blocks are the bash process, the other ones are various invocations of expr.
>
> -----Original Message-----
> From: cpufreq-owner(a)vger.kernel.org [mailto:cpufreq-owner@vger.kernel.org] On Behalf Of Bruce Dawson
> Sent: Monday, March 18, 2013 10:01 AM
> To: 'Vincent Guittot'; Viresh Kumar
> Cc: Dave Jones; cpufreq(a)vger.kernel.org; Rafael J. Wysocki; linaro-kernel(a)lists.linaro.org
> Subject: RE: CPU power management bug -- CPU bound task fails to raise CPU frequency
>
> I guess that makes sense for the scheduler to look for the idlest CPU in the system. That's good to know.
>
> I had guessed that the scheduler would do something like that and that my test would run on two cores. However I find that bash alternates between two cores, which seems odd. Additionally, each invocation of expr starts on one core and moves to another, which seems odd given that each invocation lives for a ms or less. The net effect is that six or more different cores get involved.
>
> Anyway, it is a pathological case, so maybe it doesn't matter, but given the popularity of $(command) in shell scripts it may not be completely irrelevant either.
>
> -----Original Message-----
> From: Vincent Guittot [mailto:vincent.guittot@linaro.org]
> Sent: Monday, March 18, 2013 4:07 AM
> To: Viresh Kumar
> Cc: Bruce Dawson; Dave Jones; cpufreq(a)vger.kernel.org; Rafael J. Wysocki; linaro-kernel(a)lists.linaro.org
> Subject: Re: CPU power management bug -- CPU bound task fails to raise CPU frequency
>
> On 18 March 2013 06:04, Viresh Kumar <viresh.kumar(a)linaro.org> wrote:
>> Let me get in the scheduler expert from Linaro (Vincent Guittot, would
>> be available after few hours)
>>
>> Vincent, please start reading from this mail:
>>
>> http://permalink.gmane.org/gmane.linux.kernel.cpufreq/9675
>>
>> Now, we want to understand how to make this task perform better as
>> scheduler is using multiple cpus for it and hence all are staying at
>> low freqs, as load isn't enough..
>
> Hi,
>
> Your 1st test creates a task to evaluate each expr, and the fork sequence of the scheduler looks for the idlest CPU in the system.
> That explains why your test is evenly spread over all CPUs and the average load of each CPU is below the cpufreq threshold. In contrast, your 2nd test uses only one task, which stays on one CPU and triggers the frequency increase.
>
> I would say that the scheduler behavior is almost normal: spread to get best performance (even if in this use case, the threads run
> sequentially), but you have this side effect on cpufreq, which sees each core individually as not loaded. This example tends to push in favor of better cooperation between the scheduler and cpufreq for sharing statistics.
>
> Vincent
>
>>
>> --
>> viresh
>>
>> On 18 March 2013 10:28, Bruce Dawson <bruced(a)valvesoftware.com> wrote:
>>> This is with the Ondemand governor.
>>>
>>> The more I ponder this the more I think that the real issue is not the frequency drivers, but the kernel scheduler. The shell script involves two processes being alive at any given time, and one process running (since bash always waits for expr to finish). Therefore the entire task should run on either one core or on two. Instead I see (from looking at thread scheduling graphed using the Zoom profiler -- http://www.rotateright.com/) that it runs on six different cores. bash alternates between two cores, and each invocation of expr is started on one core and then moves to another. Given that it seems not surprising that the CPU frequency management doesn't trigger.
>>>
>>>> So, the frequency might not be increased if there are multiple cpus
>>>> running for a specific task and none of them has high enough load at
>>>> that time
>>>
>>> Yep, that's what I figured. Each cpu's load is quite low -- 20% or lower -- because the work is so spread out.
>>>
>>> If I run the entire thing under "taskset 1" then everything runs on one core, the frequency elevation happens, and the entire task runs roughly three times faster.
>>>
>>> Crazy/cool.
>>>
>>> -----Original Message-----
>>> From: viresh.linux(a)gmail.com [mailto:viresh.linux@gmail.com] On
>>> Behalf Of Viresh Kumar
>>> Sent: Sunday, March 17, 2013 9:32 PM
>>> To: Bruce Dawson
>>> Cc: Dave Jones; cpufreq(a)vger.kernel.org; Rafael J. Wysocki;
>>> linaro-kernel(a)lists.linaro.org
>>> Subject: Re: CPU power management bug -- CPU bound task fails to
>>> raise CPU frequency
>>>
>>> On Sun, Mar 17, 2013 at 6:37 AM, Bruce Dawson <bruced(a)valvesoftware.com> wrote:
>>>> Dave/others, I've come up with a simple (and real) scenario where a CPU bound task running on Ubuntu (and presumably other Linux flavors) fails to be detected as CPU bound by the Linux kernel, meaning that the CPU continues to run at low speed, meaning that this CPU bound task takes (on my machines) about three times longer to run than it should.
>>>>
>>>> I found these e-mail addresses in the MAINTAINERS list under CPU FREQUENCY DRIVERS which I'm hoping is the correct area.
>>>
>>> Yes, cpufreq mailing list is the right list for this.
>>>
>>>> The basic problem is that on a multi-core system if you run a shell script that spawns lots of sub processes then the workload ends up distributed across all of the CPUs. Therefore, since none of the CPUs are particularly busy, the Linux kernel doesn't realize that a CPU bound task is running, so it leaves the CPU frequency set to low. I have confirmed the behavior in multiple ways. Specifically, I have used "iostat 1" and "mpstat -P ALL 1" to confirm that a full core's worth of CPU work is being done. mpstat also showed that the work was distributed across multiple cores. Using the zoom profiler UI for perf showed the sub processes (and bash) being spread across multiple cores, and perf stat showed that the CPU frequency was staying low even though the task was CPU bound.
>>>
>>> There are a few things which would be helpful to understand what's going on.
>>> What governor is used in your case? Probably Ondemand (my Ubuntu uses this).
>>>
>>> Ideally, cpu frequency is increased only if cpu load is very high (or
>>> above the threshold, 95 in my Ubuntu). So, the frequency might not be
>>> increased if there are multiple cpus running for a specific task and none
>>> of them has high enough load at that time.
>>>
>>> The other thing that I suspect here is a bug which was solved recently by
>>> the patch below. If policy->cpu (that might be cpu 0 for you) is sleeping,
>>> then load is never evaluated even if all other cpus are very busy. If you
>>> can try the patch below then it might be helpful. BTW, you might not be
>>> able to apply it easily as it has lots of dependencies, so you might
>>> need to pick all drivers/cpufreq patches from v3.9-rc1.
>>>
>>> commit 2abfa876f1117b0ab45f191fb1f82c41b1cbc8fe
>>> Author: Rickard Andersson <rickard.andersson(a)stericsson.com>
>>> Date: Thu Dec 27 14:55:38 2012 +0000
>>>
>>> cpufreq: handle SW coordinated CPUs
>>>
>>> This patch fixes a bug that occurred when we had load on a secondary CPU
>>> and the primary CPU was sleeping. Only one sampling timer was spawned
>>> and it was spawned as a deferred timer on the primary CPU, so when a
>>> secondary CPU had a change in load this was not detected by the cpufreq
>>> governor (both ondemand and conservative).
>>>
>>> This patch make sure that deferred timers are run on all CPUs in the
>>> case of software controlled CPUs that run on the same frequency.
>>>
>>> Signed-off-by: Rickard Andersson <rickard.andersson(a)stericsson.com>
>>> Signed-off-by: Fabio Baltieri <fabio.baltieri(a)linaro.org>
>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
>>> ---
>>> drivers/cpufreq/cpufreq_conservative.c | 3 ++-
>>> drivers/cpufreq/cpufreq_governor.c | 44
>>> ++++++++++++++++++++++++++++++++++++++------
>>> drivers/cpufreq/cpufreq_governor.h | 1 +
>>> drivers/cpufreq/cpufreq_ondemand.c | 3 ++-
>>> 4 files changed, 43 insertions(+), 8 deletions(-)
>>>
>>>
>>>> I have only reproed this behavior on six-core/twelve-thread systems. I would assume that at least a two-core system would be needed to repro this bug, and perhaps more. The bug will not repro if the system is not relatively idle, since a background CPU hog will force the frequency up.
>>>>
>>>> The repro is exquisitely simple -- ExprCount() is a simplified version of the repro (portable looping in a shell script) and BashCount() is an optimized and less portable version that runs far faster and also avoids this power management problem -- the CPU frequency is raised appropriately. Running a busy loop in another process is another way to get the frequency up and this makes ExprCount() run ~3x faster. Here is the script:
>>>>
>>>> --------------------------------------
>>>> #!/bin/bash
>>>> function ExprCount() {
>>>> i=$1
>>>> while [ $i -gt 0 ]; do
>>>> i=$(expr $i - 1)
>>>
>>> I may be wrong but one cpu is used to run this script and the other one would be used to run the expr program. So, 2 cpus should be good enough to reproduce this setup.
>>>
>>> BTW, I have tried your scripts and was able to reproduce the setup
>>> here on a 2 cpu, 4 thread system.
>>>
>>> --
>>> viresh
> --
> To unsubscribe from this list: send the line "unsubscribe cpufreq" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
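The effect discussed in this thread can be illustrated with simple
arithmetic. The sketch below is a deliberately simplified model, not the
ondemand governor's actual code: it assumes the work is spread evenly
across the CPUs involved and that the governor raises the frequency only
when a CPU's load exceeds the threshold. The function names are made up
for this example; the threshold of 95 and the six cores are the figures
quoted in the thread.

```c
/* Per-CPU load in percent when total_load_pct of work is spread evenly. */
double per_cpu_load(double total_load_pct, int ncpus)
{
	return total_load_pct / ncpus;
}

/* Simplified decision: raise only when the load exceeds the threshold. */
int would_raise_freq(double load_pct, double up_threshold_pct)
{
	return load_pct > up_threshold_pct;
}
```

Under these assumptions, one core's worth of work (100%) spread over six
CPUs gives roughly 17% per CPU, far below 95, so no CPU is sped up. The
same work pinned to one CPU (as with "taskset 1") gives 100% on that CPU
and crosses the threshold.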
While migrating to the common clock framework (CCF), we found that the
FIMD clocks were pulled down by the CCF.
If the CCF finds any clock(s) which have NOT been claimed by any of the
drivers, then such clock(s) are pulled low by the CCF.
Calling clk_prepare_enable() for the FIMD clocks fixes the issue.
Signed-off-by: Vikas Sajjan <vikas.sajjan(a)linaro.org>
---
drivers/gpu/drm/exynos/exynos_drm_fimd.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
index 9537761..d93dd8a 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
@@ -934,6 +934,9 @@ static int fimd_probe(struct platform_device *pdev)
return ret;
}
+ clk_prepare_enable(ctx->lcd_clk);
+ clk_prepare_enable(ctx->bus_clk);
+
ctx->vidcon0 = pdata->vidcon0;
ctx->vidcon1 = pdata->vidcon1;
ctx->default_win = pdata->default_win;
--
1.7.9.5
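The patch above calls clk_prepare_enable() twice without looking at the
return value. Since clk_prepare_enable() returns an int, one possible
refinement is to check each call and undo the first enable if the second
one fails. The sketch below shows only that ordering in plain userspace C.
struct mock_clk and the mock_* functions are hypothetical stand-ins
invented for this example; they are not the kernel's struct clk or clock
API.

```c
#include <errno.h>

/* Hypothetical userspace stand-in for a clock; not the kernel's struct clk. */
struct mock_clk {
	int enable_count;
	int fail;	/* when non-zero, enabling this clock fails */
};

/* Stand-in for enabling a clock: 0 on success, negative errno on failure. */
int mock_clk_prepare_enable(struct mock_clk *clk)
{
	if (clk->fail)
		return -EIO;
	clk->enable_count++;
	return 0;
}

/* Stand-in for the matching disable. */
void mock_clk_disable_unprepare(struct mock_clk *clk)
{
	clk->enable_count--;
}

/* Enable both clocks, undoing the first one if the second one fails. */
int mock_fimd_enable_clocks(struct mock_clk *lcd_clk, struct mock_clk *bus_clk)
{
	int ret;

	ret = mock_clk_prepare_enable(lcd_clk);
	if (ret)
		return ret;

	ret = mock_clk_prepare_enable(bus_clk);
	if (ret) {
		mock_clk_disable_unprepare(lcd_clk);
		return ret;
	}

	return 0;
}
```

Whether the probe function should fail in that case, or just warn and
continue, is a design decision for the driver and is not something this
sketch tries to settle.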
When a cpu goes to a deep idle state where its local timer is shut down,
it notifies the time framework to use the broadcast timer instead.
Unfortunately, the broadcast device could wake up any CPU, including an
idle one which is not concerned by the wake up at all.
This implies that, in the worst case, an idle CPU will wake up to send an
IPI to another idle cpu.
This patch solves this by setting the irq affinity to the cpu concerned
by the nearest timer event. This way, the CPU which is woken up is
guaranteed to be the one concerned by the next event, and we avoid an
unnecessary wakeup of another idle CPU.
As irq affinity is not supported by all the archs, a flag is needed
to specify which clocksource can handle it: CLOCK_EVT_FEAT_DYNIRQ.
Tested on a u8500 board with a test program doing usleep 10000
indefinitely, wired on each CPU.
With dynamic irq affinity:
Log is 10.042298 secs long with 4190 events
cpu0/state0, 24 hits, total 2718.00us, avg 113.25us, min 0.00us, max 854.00us
cpu0/state1, 994 hits, total 9874827.00us, avg 9934.43us, min 30.00us, max 10346.00us
cpu1/state0, 73 hits, total 17001.00us, avg 232.89us, min 0.00us, max 10040.00us
cpu1/state1, 1002 hits, total 9883840.00us, avg 9864.11us, min 0.00us, max 10742.00us
cluster/state0, 0 hits, total 0.00us, avg 0.00us, min 0.00us, max 0.00us
cluster/state1, 1931 hits, total 9762328.00us, avg 5055.58us, min 30.00us, max 9308.00us
Without dynamic irq affinity:
Log is 10.036834 secs long with 6574 events
cpu0/state0, 114 hits, total 20107.00us, avg 176.38us, min 0.00us, max 7233.00us
cpu0/state1, 1951 hits, total 9833836.00us, avg 5040.41us, min 0.00us, max 9217.00us
cpu1/state0, 223 hits, total 21140.00us, avg 94.80us, min 0.00us, max 2960.00us
cpu1/state1, 997 hits, total 9879748.00us, avg 9909.48us, min 0.00us, max 10346.00us
cluster/state0, 5 hits, total 5462.00us, avg 1092.40us, min 580.00us, max 2899.00us
cluster/state1, 2298 hits, total 9740988.00us, avg 4238.90us, min 30.00us, max 9217.00us
Results for the specific test case 'usleep 10000':
* reduced the number of wake ups on the system by 40%
* reduced the number of wake ups for CPU0 by 49%
* increased idle time for CPU0 by a factor of two
* increased package idle hits by 16% and average package idle time by 16%
Changelog:
V2 :
* mentioned CLOCK_EVT_FEAT_DYNIRQ flag name in patch description
* added comments for CLOCK_EVT_FEAT_DYNIRQ
* replaced tick_broadcast_set_affinity parameter to use a cpumask
V1 : initial post
Daniel Lezcano (3):
time : pass broadcast parameter
time : set broadcast irq affinity
ARM: nomadik: add dynamic irq flag to the timer
Viresh Kumar (1):
ARM: timer-sp: Set dynamic irq affinity
arch/arm/common/timer-sp.c | 3 ++-
drivers/clocksource/nomadik-mtu.c | 3 ++-
include/linux/clockchips.h | 5 +++++
kernel/time/tick-broadcast.c | 41 +++++++++++++++++++++++++++++--------
4 files changed, 42 insertions(+), 10 deletions(-)
--
1.7.9.5
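The selection step described in this cover letter (target the cpu
concerned by the nearest timer event) can be sketched in userspace as
follows. This is only an illustration under assumptions: the function
name, the plain int array standing in for the broadcast mask, and the
nanosecond timestamps are all made up for this example and are not taken
from the patches. Programming the irq affinity itself is left out.

```c
#include <stdint.h>

/*
 * Return the CPU in the broadcast set with the earliest pending event,
 * or -1 if no CPU is currently relying on the broadcast timer.
 */
int pick_broadcast_target(const uint64_t *next_event_ns,
			  const int *in_broadcast, int ncpus)
{
	int cpu, target = -1;

	for (cpu = 0; cpu < ncpus; cpu++) {
		if (!in_broadcast[cpu])
			continue;
		if (target < 0 || next_event_ns[cpu] < next_event_ns[target])
			target = cpu;
	}

	return target;
}
```

If the broadcast interrupt is routed to the CPU returned here, the CPU
that wakes up is the one whose event expires first, so no other idle CPU
has to be woken just to forward an IPI.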
On my smp platform, which is made of 5 cores in 2 clusters, the
nr_busy_cpus field of the sched_group_power struct is not zero when the
platform is fully idle. The root cause is:
During the boot sequence, some CPUs reach the idle loop and set their
NOHZ_IDLE flag while waiting for other CPUs to boot. But the nr_busy_cpus
field is initialized later with the assumption that all CPUs are in the busy
state whereas some CPUs have already set their NOHZ_IDLE flag.
More generally, the NOHZ_IDLE flag must be initialized when new sched_domains
are created in order to ensure that NOHZ_IDLE and nr_busy_cpus are aligned.
This condition can be ensured by adding a synchronize_rcu between the
destruction of old sched_domains and the creation of new ones so the NOHZ_IDLE
flag will not be updated with an old sched_domain once it has been initialized.
But this solution introduces an additional latency in the rebuild sequence
that is called during cpu hotplug.
As suggested by Frederic Weisbecker, another solution is to have the same
rcu lifecycle for both the NOHZ_IDLE flag and the sched_domain struct. I have
introduced a new sched_domain_rq struct that is the entry point for both
sched_domains and objects that must follow the same lifecycle, like the
NOHZ_IDLE flags. They will share the same RCU lifecycle and will always be
synchronized.
The synchronization is done at the cost of:
- an additional indirection for accessing the first sched_domain level
- an additional indirection and a rcu_dereference before accessing the
NOHZ_IDLE flag.
Change since v4:
- link both sched_domain and NOHZ_IDLE flag in one RCU object so
their states are always synchronized.
Change since V3:
- NOHZ flag is not cleared if a NULL domain is attached to the CPU
- Remove patch 2/2 which becomes useless with latest modifications
Change since V2:
- change the initialization to idle state instead of busy state so a CPU that
enters idle during the build of the sched_domain will not corrupt the
initialization state
Change since V1:
- remove the patch for SCHED softirq on an idle core use case as it was
a side effect of the other use cases.
Signed-off-by: Vincent Guittot <vincent.guittot(a)linaro.org>
---
include/linux/sched.h | 6 +++
kernel/sched/core.c | 105 ++++++++++++++++++++++++++++++++++++++++++++-----
kernel/sched/fair.c | 35 +++++++++++------
kernel/sched/sched.h | 24 +++++++++--
4 files changed, 145 insertions(+), 25 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index d35d2b6..2a52188 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -959,6 +959,12 @@ struct sched_domain {
unsigned long span[0];
};
+struct sched_domain_rq {
+ struct sched_domain *sd;
+ unsigned long flags;
+ struct rcu_head rcu; /* used during destruction */
+};
+
static inline struct cpumask *sched_domain_span(struct sched_domain *sd)
{
return to_cpumask(sd->span);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7f12624..69e2313 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5602,6 +5602,15 @@ static void destroy_sched_domains(struct sched_domain *sd, int cpu)
destroy_sched_domain(sd, cpu);
}
+static void destroy_sched_domain_rq(struct sched_domain_rq *sd_rq, int cpu)
+{
+ if (!sd_rq)
+ return;
+
+ destroy_sched_domains(sd_rq->sd, cpu);
+ kfree_rcu(sd_rq, rcu);
+}
+
/*
* Keep a special pointer to the highest sched_domain that has
* SD_SHARE_PKG_RESOURCE set (Last Level Cache Domain) for this
@@ -5632,10 +5641,23 @@ static void update_top_cache_domain(int cpu)
* hold the hotplug lock.
*/
static void
-cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
+cpu_attach_domain(struct sched_domain_rq *sd_rq, struct root_domain *rd,
+ int cpu)
{
struct rq *rq = cpu_rq(cpu);
- struct sched_domain *tmp;
+ struct sched_domain_rq *tmp_rq;
+ struct sched_domain *tmp, *sd = NULL;
+
+ /*
+ * If we don't have any sched_domain and associated object, we can
+ * directly jump to the attach sequence otherwise we try to degenerate
+ * the sched_domain
+ */
+ if (!sd_rq)
+ goto attach;
+
+ /* Get a pointer to the 1st sched_domain */
+ sd = sd_rq->sd;
/* Remove the sched domains which do not contribute to scheduling. */
for (tmp = sd; tmp; ) {
@@ -5658,14 +5680,17 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
destroy_sched_domain(tmp, cpu);
if (sd)
sd->child = NULL;
+ /* update sched_domain_rq */
+ sd_rq->sd = sd;
}
+attach:
sched_domain_debug(sd, cpu);
rq_attach_root(rq, rd);
- tmp = rq->sd;
- rcu_assign_pointer(rq->sd, sd);
- destroy_sched_domains(tmp, cpu);
+ tmp_rq = rq->sd_rq;
+ rcu_assign_pointer(rq->sd_rq, sd_rq);
+ destroy_sched_domain_rq(tmp_rq, cpu);
update_top_cache_domain(cpu);
}
@@ -5695,12 +5720,14 @@ struct sd_data {
};
struct s_data {
+ struct sched_domain_rq ** __percpu sd_rq;
struct sched_domain ** __percpu sd;
struct root_domain *rd;
};
enum s_alloc {
sa_rootdomain,
+ sa_sd_rq,
sa_sd,
sa_sd_storage,
sa_none,
@@ -5935,7 +5962,7 @@ static void init_sched_groups_power(int cpu, struct sched_domain *sd)
return;
update_group_power(sd, cpu);
- atomic_set(&sg->sgp->nr_busy_cpus, sg->group_weight);
+ atomic_set(&sg->sgp->nr_busy_cpus, 0);
}
int __weak arch_sd_sibling_asym_packing(void)
@@ -6011,6 +6038,8 @@ static void set_domain_attribute(struct sched_domain *sd,
static void __sdt_free(const struct cpumask *cpu_map);
static int __sdt_alloc(const struct cpumask *cpu_map);
+static void __sdrq_free(const struct cpumask *cpu_map, struct s_data *d);
+static int __sdrq_alloc(const struct cpumask *cpu_map, struct s_data *d);
static void __free_domain_allocs(struct s_data *d, enum s_alloc what,
const struct cpumask *cpu_map)
@@ -6019,6 +6048,9 @@ static void __free_domain_allocs(struct s_data *d, enum s_alloc what,
case sa_rootdomain:
if (!atomic_read(&d->rd->refcount))
free_rootdomain(&d->rd->rcu); /* fall through */
+ case sa_sd_rq:
+ __sdrq_free(cpu_map, d); /* fall through */
+ free_percpu(d->sd_rq); /* fall through */
case sa_sd:
free_percpu(d->sd); /* fall through */
case sa_sd_storage:
@@ -6038,9 +6070,14 @@ static enum s_alloc __visit_domain_allocation_hell(struct s_data *d,
d->sd = alloc_percpu(struct sched_domain *);
if (!d->sd)
return sa_sd_storage;
+ d->sd_rq = alloc_percpu(struct sched_domain_rq *);
+ if (!d->sd_rq)
+ return sa_sd;
+ if (__sdrq_alloc(cpu_map, d))
+ return sa_sd_rq;
d->rd = alloc_rootdomain();
if (!d->rd)
- return sa_sd;
+ return sa_sd_rq;
return sa_rootdomain;
}
@@ -6466,6 +6503,46 @@ static void __sdt_free(const struct cpumask *cpu_map)
}
}
+static int __sdrq_alloc(const struct cpumask *cpu_map, struct s_data *d)
+{
+ int j;
+
+ for_each_cpu(j, cpu_map) {
+ struct sched_domain_rq *sd_rq;
+
+ sd_rq = kzalloc_node(sizeof(struct sched_domain_rq),
+ GFP_KERNEL, cpu_to_node(j));
+ if (!sd_rq)
+ return -ENOMEM;
+
+ *per_cpu_ptr(d->sd_rq, j) = sd_rq;
+ }
+
+ return 0;
+}
+
+static void __sdrq_free(const struct cpumask *cpu_map, struct s_data *d)
+{
+ int j;
+
+ for_each_cpu(j, cpu_map)
+ if (*per_cpu_ptr(d->sd_rq, j))
+ kfree(*per_cpu_ptr(d->sd_rq, j));
+}
+
+static void build_sched_domain_rq(struct s_data *d, int cpu)
+{
+ struct sched_domain_rq *sd_rq;
+ struct sched_domain *sd;
+
+ /* Attach sched_domain to sched_domain_rq */
+ sd = *per_cpu_ptr(d->sd, cpu);
+ sd_rq = *per_cpu_ptr(d->sd_rq, cpu);
+ sd_rq->sd = sd;
+ /* Init flags */
+ set_bit(NOHZ_IDLE, sched_rq_flags(sd_rq));
+}
+
struct sched_domain *build_sched_domain(struct sched_domain_topology_level *tl,
struct s_data *d, const struct cpumask *cpu_map,
struct sched_domain_attr *attr, struct sched_domain *child,
@@ -6495,6 +6572,7 @@ static int build_sched_domains(const struct cpumask *cpu_map,
struct sched_domain_attr *attr)
{
enum s_alloc alloc_state = sa_none;
+ struct sched_domain_rq *sd_rq;
struct sched_domain *sd;
struct s_data d;
int i, ret = -ENOMEM;
@@ -6547,11 +6625,18 @@ static int build_sched_domains(const struct cpumask *cpu_map,
}
}
+ /* Init objects that must follow the sched_domain lifecycle */
+ for_each_cpu(i, cpu_map) {
+ build_sched_domain_rq(&d, i);
+ }
+
/* Attach the domains */
rcu_read_lock();
for_each_cpu(i, cpu_map) {
- sd = *per_cpu_ptr(d.sd, i);
- cpu_attach_domain(sd, d.rd, i);
+ sd_rq = *per_cpu_ptr(d.sd_rq, i);
+ cpu_attach_domain(sd_rq, d.rd, i);
+ /* claim allocation of sched_domain_rq object */
+ *per_cpu_ptr(d.sd_rq, i) = NULL;
}
rcu_read_unlock();
@@ -6982,7 +7067,7 @@ void __init sched_init(void)
rq->last_load_update_tick = jiffies;
#ifdef CONFIG_SMP
- rq->sd = NULL;
+ rq->sd_rq = NULL;
rq->rd = NULL;
rq->cpu_power = SCHED_POWER_SCALE;
rq->post_schedule = 0;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7a33e59..1c7447e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5392,31 +5392,39 @@ static inline void nohz_balance_exit_idle(int cpu)
static inline void set_cpu_sd_state_busy(void)
{
+ struct sched_domain_rq *sd_rq;
struct sched_domain *sd;
int cpu = smp_processor_id();
- if (!test_bit(NOHZ_IDLE, nohz_flags(cpu)))
- return;
- clear_bit(NOHZ_IDLE, nohz_flags(cpu));
-
rcu_read_lock();
- for_each_domain(cpu, sd)
+ sd_rq = get_sched_domain_rq(cpu);
+
+ if (!sd_rq || !test_bit(NOHZ_IDLE, sched_rq_flags(sd_rq)))
+ goto unlock;
+ clear_bit(NOHZ_IDLE, sched_rq_flags(sd_rq));
+
+ for_each_domain_from_rq(sd_rq, sd)
atomic_inc(&sd->groups->sgp->nr_busy_cpus);
+unlock:
rcu_read_unlock();
}
void set_cpu_sd_state_idle(void)
{
+ struct sched_domain_rq *sd_rq;
struct sched_domain *sd;
int cpu = smp_processor_id();
- if (test_bit(NOHZ_IDLE, nohz_flags(cpu)))
- return;
- set_bit(NOHZ_IDLE, nohz_flags(cpu));
-
rcu_read_lock();
- for_each_domain(cpu, sd)
+ sd_rq = get_sched_domain_rq(cpu);
+
+ if (!sd_rq || test_bit(NOHZ_IDLE, sched_rq_flags(sd_rq)))
+ goto unlock;
+ set_bit(NOHZ_IDLE, sched_rq_flags(sd_rq));
+
+ for_each_domain_from_rq(sd_rq, sd)
atomic_dec(&sd->groups->sgp->nr_busy_cpus);
+unlock:
rcu_read_unlock();
}
@@ -5673,7 +5681,12 @@ static void run_rebalance_domains(struct softirq_action *h)
static inline int on_null_domain(int cpu)
{
- return !rcu_dereference_sched(cpu_rq(cpu)->sd);
+ struct sched_domain_rq *sd_rq =
+ rcu_dereference_sched(cpu_rq(cpu)->sd_rq);
+ struct sched_domain *sd = NULL;
+ if (sd_rq)
+ sd = sd_rq->sd;
+ return !sd;
}
/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index cc03cfd..f589306 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -417,7 +417,7 @@ struct rq {
#ifdef CONFIG_SMP
struct root_domain *rd;
- struct sched_domain *sd;
+ struct sched_domain_rq *sd_rq;
unsigned long cpu_power;
@@ -505,21 +505,37 @@ DECLARE_PER_CPU(struct rq, runqueues);
#ifdef CONFIG_SMP
-#define rcu_dereference_check_sched_domain(p) \
+#define rcu_dereference_check_sched_domain_rq(p) \
rcu_dereference_check((p), \
lockdep_is_held(&sched_domains_mutex))
+#define get_sched_domain_rq(cpu) \
+ rcu_dereference_check_sched_domain_rq(cpu_rq(cpu)->sd_rq)
+
+#define rcu_dereference_check_sched_domain(cpu) ({ \
+ struct sched_domain_rq *__sd_rq = get_sched_domain_rq(cpu); \
+ struct sched_domain *__sd = NULL; \
+ if (__sd_rq) \
+ __sd = __sd_rq->sd; \
+ __sd; \
+})
+
+#define sched_rq_flags(sd_rq) (&sd_rq->flags)
+
/*
- * The domain tree (rq->sd) is protected by RCU's quiescent state transition.
+ * The domain tree (rq->sd_rq) is protected by RCU's quiescent state transition.
* See detach_destroy_domains: synchronize_sched for details.
*
* The domain tree of any CPU may only be accessed from within
* preempt-disabled sections.
*/
#define for_each_domain(cpu, __sd) \
- for (__sd = rcu_dereference_check_sched_domain(cpu_rq(cpu)->sd); \
+ for (__sd = rcu_dereference_check_sched_domain(cpu); \
__sd; __sd = __sd->parent)
+#define for_each_domain_from_rq(sd_rq, __sd) \
+ for (__sd = sd_rq->sd; __sd; __sd = __sd->parent)
+
#define for_each_lower_domain(sd) for (; sd; sd = sd->child)
/**
--
1.7.9.5
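The idea behind sched_domain_rq in the patch above (the domain pointer and
the NOHZ_IDLE flag live in one object, so they are replaced together) can
be modelled in a few lines of single-threaded userspace C. This is only a
sketch under assumptions: every model_* name is invented for this example,
there is no real RCU or locking here, and a single counter stands in for
the per-group nr_busy_cpus.

```c
#define MODEL_NOHZ_IDLE 1UL

/* Stand-in for a sched_domain with its busy counter. */
struct model_domain {
	int nr_busy_cpus;
};

/* Counterpart of sched_domain_rq: domain pointer and flags travel together. */
struct model_domain_rq {
	struct model_domain *sd;
	unsigned long flags;
};

/* Per-CPU slot; the kernel would publish it with rcu_assign_pointer(). */
struct model_cpu {
	struct model_domain_rq *sd_rq;
};

/* A freshly built object starts in the idle state, as in the patch. */
void model_init_rq(struct model_domain_rq *sd_rq, struct model_domain *sd)
{
	sd_rq->sd = sd;
	sd_rq->flags = MODEL_NOHZ_IDLE;
}

/* Swap in a new object; the old flag state goes away with the old domain. */
struct model_domain_rq *model_attach(struct model_cpu *cpu,
				     struct model_domain_rq *new_rq)
{
	struct model_domain_rq *old = cpu->sd_rq;

	cpu->sd_rq = new_rq;
	return old;
}

/* Leave idle: clear the flag and count this CPU as busy, once. */
void model_set_busy(struct model_cpu *cpu)
{
	struct model_domain_rq *sd_rq = cpu->sd_rq;

	if (!sd_rq || !(sd_rq->flags & MODEL_NOHZ_IDLE))
		return;
	sd_rq->flags &= ~MODEL_NOHZ_IDLE;
	sd_rq->sd->nr_busy_cpus++;
}

/* Enter idle: set the flag and drop this CPU from the busy count, once. */
void model_set_idle(struct model_cpu *cpu)
{
	struct model_domain_rq *sd_rq = cpu->sd_rq;

	if (!sd_rq || (sd_rq->flags & MODEL_NOHZ_IDLE))
		return;
	sd_rq->flags |= MODEL_NOHZ_IDLE;
	sd_rq->sd->nr_busy_cpus--;
}
```

Because the flag is read from the same object as the domain pointer, a
CPU that was busy under the old domain starts from the idle state in the
new one, and its next busy transition is counted exactly once against the
new counter.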
Guenter,
Please check these v4 patches. Thanks for all your reviews/comments on this
patch set.
Anton Vorontsov,
I have added your Acked-by: to patch [1/3], but for this v4 [2/3] we had to
change it a lot according to Guenter's feedback and our internal discussions,
so please have a look at this new [2/3] again. Thank you.
v3 -> v4 changes:
for patch [3/3]
- define delays in HZ
- update ab8500_read_sensor function, returning temp by parameter
- remove ab8500_is_visible function
- use clamp_val in set_min and set_max callback
- remove unnecessary locks in remove and suspend functions
- let abx500 and ab8500 use its own data structure
for patch [2/3]
- move the data tables from drivers/power/ab8500_bmdata.c to
include/linux/power/ab8500.h
- rename drivers/power/ab8500_bmdata.c to drivers/power/ab8500_bm.c
- rename the variables to eliminate CamelCase warnings
- add const attribute to this data
v2 -> v3 changes:
- Add interface for converting voltage to temperature
- Remove temp5 sensor since we cannot offer temperature read interface of it
- Update hyst to use absolute temperature instead of a difference
- Add the 3/3 patch
v1 -> v2 changes:
- Add Documentation/hwmon/ab8500 and Documentation/hwmon/abx500
- Make devices which cannot report milli-Celsius invisible
- Add temp5_crit interface
- Re-work the old find_active_thresholds() to threshold_updated()
- Reset updated_min_alarm and updated_max_alarm at the end of each loop
- Update the hyst mechanism to make it work as real hyst
- Remove non-standard attributes
- Re-order the operations sequence inside probe and remove functions
- Update all the lock usages to eliminate race conditions
- Make attributes index start from 0
also changes:
- Since the old [1/2] "ARM: ux500: rename ab8500 to abx500 for hwmon driver"
has been merged by Samuel, it won't be sent again.
- Add another new patch "ab8500_btemp: export two symbols" as [2/2] of this
patch set.
Hongbo Zhang (3):
ab8500_btemp: make ab8500_btemp_get* interfaces public
ab8500: re-arrange ab8500 power and temperature data tables
hwmon: add ST-Ericsson ABX500 hwmon driver
Documentation/hwmon/ab8500 | 22 ++
Documentation/hwmon/abx500 | 28 ++
drivers/hwmon/Kconfig | 13 +
drivers/hwmon/Makefile | 1 +
drivers/hwmon/ab8500.c | 208 ++++++++++++++
drivers/hwmon/abx500.c | 494 +++++++++++++++++++++++++++++++++
drivers/hwmon/abx500.h | 69 +++++
drivers/power/Makefile | 2 +-
drivers/power/ab8500_bm.c | 341 +++++++++++++++++++++++
drivers/power/ab8500_bmdata.c | 519 -----------------------------------
drivers/power/ab8500_btemp.c | 5 +-
drivers/power/ab8500_fg.c | 4 +-
include/linux/mfd/abx500.h | 6 +-
include/linux/mfd/abx500/ab8500-bm.h | 5 +
include/linux/power/ab8500.h | 189 +++++++++++++
15 files changed, 1380 insertions(+), 526 deletions(-)
create mode 100644 Documentation/hwmon/ab8500
create mode 100644 Documentation/hwmon/abx500
create mode 100644 drivers/hwmon/ab8500.c
create mode 100644 drivers/hwmon/abx500.c
create mode 100644 drivers/hwmon/abx500.h
create mode 100644 drivers/power/ab8500_bm.c
delete mode 100644 drivers/power/ab8500_bmdata.c
create mode 100644 include/linux/power/ab8500.h
--
1.8.0
---------- Forwarded message ----------
From: <linaro-kernel-bounces(a)lists.linaro.org>
Date: 18 March 2013 22:31
Subject: Auto-discard notification
To: linaro-kernel-owner(a)lists.linaro.org
The attached message has been automatically discarded by lists.linaro.org and
hence I am forwarding it again.
--
viresh
---------- Forwarded message ----------
From: Bruce Dawson <bruced(a)valvesoftware.com>
To: 'Vincent Guittot' <vincent.guittot(a)linaro.org>, Viresh Kumar
<viresh.kumar(a)linaro.org>
Cc: Dave Jones <davej(a)redhat.com>, "cpufreq(a)vger.kernel.org"
<cpufreq(a)vger.kernel.org>, "Rafael J. Wysocki" <rjw(a)sisk.pl>,
"linaro-kernel(a)lists.linaro.org" <linaro-kernel(a)lists.linaro.org>
Date: Mon, 18 Mar 2013 17:01:17 +0000
Subject: RE: CPU power management bug -- CPU bound task fails to raise
CPU frequency
I guess that makes sense for the scheduler to look for the idlest CPU
in the system. That's good to know.
I had guessed that the scheduler would do something like that and that
my test would run on two cores. However I find that bash alternates
between two cores, which seems odd. Additionally, each invocation of
expr starts on one core and moves to another, which seems odd given
that each invocation lives for a ms or less. The net effect is that
six or more different cores get involved.
Anyway, it is a pathological case, so maybe it doesn't matter, but
given the popularity of $(command) in shell scripts it may not be
completely irrelevant either.
-----Original Message-----
From: Vincent Guittot [mailto:vincent.guittot@linaro.org]
Sent: Monday, March 18, 2013 4:07 AM
To: Viresh Kumar
Cc: Bruce Dawson; Dave Jones; cpufreq(a)vger.kernel.org; Rafael J.
Wysocki; linaro-kernel(a)lists.linaro.org
Subject: Re: CPU power management bug -- CPU bound task fails to raise
CPU frequency
On 18 March 2013 06:04, Viresh Kumar <viresh.kumar(a)linaro.org> wrote:
> Let me get in the scheduler expert from Linaro (Vincent Guittot, would
> be available after few hours)
>
> Vincent, please start reading from this mail:
>
> http://permalink.gmane.org/gmane.linux.kernel.cpufreq/9675
>
> Now, we want to understand how to make this task perform better as
> scheduler is using multiple cpus for it and hence all are staying at
> low freqs, as load isn't enough..
Hi,
Your 1st test creates a task to evaluate each expr, and the fork
sequence of the scheduler looks for the idlest CPU in the system.
That explains why your test is evenly spread across all CPUs and the
average load of each CPU is below the cpufreq threshold. In contrast,
your 2nd test uses only one task, which stays on one CPU and
triggers the frequency increase.
I would say that the scheduler behavior is almost normal: spread to
get the best performance (even if in this use case the threads run
sequentially), but you have this side effect on cpufreq, which sees
each core individually as not loaded. This example tends to push in
favor of a better cooperation between scheduler and cpufreq for
sharing statistics.
Vincent
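To make the effect described above concrete, here is a rough user-space sketch (not kernel code). It assumes the governor simply compares each CPU's load percentage against an up-threshold, which is a simplification of what ondemand does. The helper names are made up for illustration; the 95 threshold and the six-core spread are the figures mentioned in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: load (in percent) seen by each CPU when
 * total_load_pct (100 = one fully busy core) is spread evenly
 * over ncpus CPUs.
 */
static unsigned int per_cpu_load_pct(unsigned int total_load_pct,
				     unsigned int ncpus)
{
	assert(ncpus > 0);
	return total_load_pct / ncpus;
}

/*
 * Simplified governor decision (an assumption, not the real ondemand
 * code): raise the frequency only when this CPU's own load is above
 * the threshold. Each CPU is evaluated on its own.
 */
static bool governor_raises_freq(unsigned int cpu_load_pct,
				 unsigned int up_threshold_pct)
{
	return cpu_load_pct > up_threshold_pct;
}
```

With one core's worth of work spread over six CPUs, per_cpu_load_pct(100, 6) gives 16, which is far below 95, so no CPU is sped up. Pinned to one CPU (as with "taskset 1"), the load is 100 and the check passes.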
>
> --
> viresh
>
> On 18 March 2013 10:28, Bruce Dawson <bruced(a)valvesoftware.com> wrote:
>> This is with the Ondemand governor.
>>
>> The more I ponder this the more I think that the real issue is not the frequency drivers, but the kernel scheduler. The shell script involves two processes being alive at any given time, and one process running (since bash always waits for expr to finish). Therefore the entire task should run on either one core or on two. Instead I see (from looking at thread scheduling graphed using the Zoom profiler -- http://www.rotateright.com/) that it runs on six different cores. bash alternates between two cores, and each invocation of expr is started on one core and then moves to another. Given that it seems not surprising that the CPU frequency management doesn't trigger.
>>
>>> So, the frequency might not be increased if there are multiple cpus
>>> running for a specific task and none of them has high enough load at
>>> that time
>>
>> Yep, that's what I figured. Each cpu's load is quite low -- 20% or lower -- because the work is so spread out.
>>
>> If I run the entire thing under "taskset 1" then everything runs on one core, the frequency elevation happens, and the entire task runs roughly three times faster.
>>
>> Crazy/cool.
>>
>> -----Original Message-----
>> From: viresh.linux(a)gmail.com [mailto:viresh.linux@gmail.com] On
>> Behalf Of Viresh Kumar
>> Sent: Sunday, March 17, 2013 9:32 PM
>> To: Bruce Dawson
>> Cc: Dave Jones; cpufreq(a)vger.kernel.org; Rafael J. Wysocki;
>> linaro-kernel(a)lists.linaro.org
>> Subject: Re: CPU power management bug -- CPU bound task fails to
>> raise CPU frequency
>>
>> On Sun, Mar 17, 2013 at 6:37 AM, Bruce Dawson <bruced(a)valvesoftware.com> wrote:
>>> Dave/others, I've come up with a simple (and real) scenario where a CPU bound task running on Ubuntu (and presumably other Linux flavors) fails to be detected as CPU bound by the Linux kernel, meaning that the CPU continues to run at low speed, meaning that this CPU bound task takes (on my machines) about three times longer to run than it should.
>>>
>>> I found these e-mail addresses in the MAINTAINERS list under CPU FREQUENCY DRIVERS which I'm hoping is the correct area.
>>
>> Yes, cpufreq mailing list is the right list for this.
>>
>>> The basic problem is that on a multi-core system if you run a shell script that spawns lots of sub processes then the workload ends up distributed across all of the CPUs. Therefore, since none of the CPUs are particularly busy, the Linux kernel doesn't realize that a CPU bound task is running, so it leaves the CPU frequency set to low. I have confirmed the behavior in multiple ways. Specifically, I have used "iostat 1" and "mpstat -P ALL 1" to confirm that a full core's worth of CPU work is being done. mpstat also showed that the work was distributed across multiple cores. Using the zoom profiler UI for perf showed the sub processes (and bash) being spread across multiple cores, and perf stat showed that the CPU frequency was staying low even though the task was CPU bound.
>>
>> There are few things which would be helpful to understand what's going on.
>> What governor is used in your case? Probably Ondemand (My ubuntu uses this).
>>
>> Ideally, cpu frequency is increased only if cpu load is very high (or
>> above threshold,
>> 95 in my ubuntu). So, the frequency might not be increased if there are multiple cpus running for a specific task and none of them has high enough load at that time.
>>
>> Other stuff that i suspect here is a bug which was solved recently by
>> below patch. If
>> policy->cpu (that might be cpu 0 for you) is sleeping, then load is
>> never evaluated even
>> if all other cpus are very busy. If you can try below patch then it might be helpful. BTW, you might not be able to apply it easily as it has got lots of dependencies.. and so you might need to pick all drivers/cpufreq patches from v3.9-rc1.
>>
>> commit 2abfa876f1117b0ab45f191fb1f82c41b1cbc8fe
>> Author: Rickard Andersson <rickard.andersson(a)stericsson.com>
>> Date: Thu Dec 27 14:55:38 2012 +0000
>>
>> cpufreq: handle SW coordinated CPUs
>>
>> This patch fixes a bug that occurred when we had load on a secondary CPU
>> and the primary CPU was sleeping. Only one sampling timer was spawned
>> and it was spawned as a deferred timer on the primary CPU, so when a
>> secondary CPU had a change in load this was not detected by the cpufreq
>> governor (both ondemand and conservative).
>>
>> This patch make sure that deferred timers are run on all CPUs in the
>> case of software controlled CPUs that run on the same frequency.
>>
>> Signed-off-by: Rickard Andersson <rickard.andersson(a)stericsson.com>
>> Signed-off-by: Fabio Baltieri <fabio.baltieri(a)linaro.org>
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
>> ---
>> drivers/cpufreq/cpufreq_conservative.c | 3 ++-
>> drivers/cpufreq/cpufreq_governor.c | 44 ++++++++++++++++++++++++++++++++++++++------
>> drivers/cpufreq/cpufreq_governor.h | 1 +
>> drivers/cpufreq/cpufreq_ondemand.c | 3 ++-
>> 4 files changed, 43 insertions(+), 8 deletions(-)
>>
>>
>>> I have only reproed this behavior on six-core/twelve-thread systems. I would assume that at least a two-core system would be needed to repro this bug, and perhaps more. The bug will not repro if the system is not relatively idle, since a background CPU hog will force the frequency up.
>>>
>>> The repro is exquisitely simple -- ExprCount() is a simplified version of the repro (portable looping in a shell script) and BashCount() is an optimized and less portable version that runs far faster and also avoids this power management problem -- the CPU frequency is raised appropriately. Running a busy loop in another process is another way to get the frequency up and this makes ExprCount() run ~3x faster. Here is the script:
>>>
>>> --------------------------------------
>>> #!/bin/bash
>>> function ExprCount() {
>>> i=$1
>>> while [ $i -gt 0 ]; do
>>> i=$(expr $i - 1)
>>
>> I may be wrong but one cpu is used to run this script and other one would be used to run expr program.. So, 2 cpus should be good enough to reproduce this setup.
>>
>> BTW, i have tried your scripts and was able to reproduce the setup
>> here on a 2 cpu
>> 4 thread system.
>>
>> --
>> viresh
=== Highlights ===
* My Linaro connect talk was covered on lwn here:
https://lwn.net/Articles/542466/
* Worked with tglx to try to sort out remaining issues on reducing
timekeeping lock hold time
* Got my CLOCK_TAI patches properly tested and ready for 3.10
* Got a bunch of community timekeeping changes queued for 3.10
* Re-reported git server errors folks in the community were seeing w/
our trees.
* Sent out weekly email Android Upstreaming status report (canned
hangout, since we all were in HK).
* Reviewed Mathieu's sysrq timeout patch.
* Pointed Axel at ioctl work being done in trinity, and he pointed out
that others are extending it for Android testing.
* Prepped and set up interviews with two Linaro candidates.
* Looked over DmitryP's netfilter patch queue, and sent back some minor
comments.
* Got the linaro.android branch updated to 3.9, using the
experimental/android-3.9 branch on AOSP. Thanks to Tixy for helping with
testing!
* Took a look at the ion driver to see what shape it was in for
upstreaming. It has some arm-specific assumptions, so it may need some
rework.
* Synced with Greg and AndrzejP on the removal of CCG from staging
(since it's basically abandoned in favor of the configfs gadget). Benoit
from Google chimed in as well here.
* Did an initial review of Serban's binder patches
=== Plans ===
* Sort out Linaro Connect expense reporting
* Another Linaro candidate interview.
* Review new volatile range patches from Minchan
* Try to finish up timekeeping lock hold time reductions
* Send pull request to tglx for my 3.10 queue
* Work on earlysuspend blog post
* Maybe send out pstore updates from Google?
=== Issues ===
* Jet lag returning from Linaro Connect was pretty bad. Think I'm
finally over it though.
On Sun, Mar 17, 2013 at 6:37 AM, Bruce Dawson <bruced(a)valvesoftware.com> wrote:
> Dave/others, I've come up with a simple (and real) scenario where a CPU bound task running on Ubuntu (and presumably other Linux flavors) fails to be detected as CPU bound by the Linux kernel, meaning that the CPU continues to run at low speed, meaning that this CPU bound task takes (on my machines) about three times longer to run than it should.
>
> I found these e-mail addresses in the MAINTAINERS list under CPU FREQUENCY DRIVERS which I'm hoping is the correct area.
Yes, cpufreq mailing list is the right list for this.
> The basic problem is that on a multi-core system if you run a shell script that spawns lots of sub processes then the workload ends up distributed across all of the CPUs. Therefore, since none of the CPUs are particularly busy, the Linux kernel doesn't realize that a CPU bound task is running, so it leaves the CPU frequency set to low. I have confirmed the behavior in multiple ways. Specifically, I have used "iostat 1" and "mpstat -P ALL 1" to confirm that a full core's worth of CPU work is being done. mpstat also showed that the work was distributed across multiple cores. Using the zoom profiler UI for perf showed the sub processes (and bash) being spread across multiple cores, and perf stat showed that the CPU frequency was staying low even though the task was CPU bound.
There are a few things which would be helpful to understand what's going on.
What governor is used in your case? Probably Ondemand (my Ubuntu uses this).
Ideally, cpu frequency is increased only if cpu load is very high (or
above the threshold, 95 in my Ubuntu). So, the frequency might not be
increased if there are multiple cpus running for a specific task and none
of them has high enough load at that time.
Another thing I suspect here is a bug which was solved recently by the
patch below. If policy->cpu (that might be cpu 0 for you) is sleeping, then
load is never evaluated even if all other cpus are very busy. If you can try
the patch below then it might be helpful. BTW, you might not be able to apply
it easily as it has got lots of dependencies, and so you might need to pick
all drivers/cpufreq patches from v3.9-rc1.
commit 2abfa876f1117b0ab45f191fb1f82c41b1cbc8fe
Author: Rickard Andersson <rickard.andersson(a)stericsson.com>
Date: Thu Dec 27 14:55:38 2012 +0000
cpufreq: handle SW coordinated CPUs
This patch fixes a bug that occurred when we had load on a secondary CPU
and the primary CPU was sleeping. Only one sampling timer was spawned
and it was spawned as a deferred timer on the primary CPU, so when a
secondary CPU had a change in load this was not detected by the cpufreq
governor (both ondemand and conservative).
This patch make sure that deferred timers are run on all CPUs in the
case of software controlled CPUs that run on the same frequency.
Signed-off-by: Rickard Andersson <rickard.andersson(a)stericsson.com>
Signed-off-by: Fabio Baltieri <fabio.baltieri(a)linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
---
drivers/cpufreq/cpufreq_conservative.c | 3 ++-
drivers/cpufreq/cpufreq_governor.c | 44 ++++++++++++++++++++++++++++++++++++++------
drivers/cpufreq/cpufreq_governor.h | 1 +
drivers/cpufreq/cpufreq_ondemand.c | 3 ++-
4 files changed, 43 insertions(+), 8 deletions(-)
> I have only reproed this behavior on six-core/twelve-thread systems. I would assume that at least a two-core system would be needed to repro this bug, and perhaps more. The bug will not repro if the system is not relatively idle, since a background CPU hog will force the frequency up.
>
> The repro is exquisitely simple -- ExprCount() is a simplified version of the repro (portable looping in a shell script) and BashCount() is an optimized and less portable version that runs far faster and also avoids this power management problem -- the CPU frequency is raised appropriately. Running a busy loop in another process is another way to get the frequency up and this makes ExprCount() run ~3x faster. Here is the script:
>
> --------------------------------------
> #!/bin/bash
> function ExprCount() {
> i=$1
> while [ $i -gt 0 ]; do
> i=$(expr $i - 1)
I may be wrong, but one cpu is used to run this script and another one
would be used to run the expr program. So, 2 cpus should be good enough
to reproduce this setup.
BTW, I have tried your scripts and was able to reproduce the setup
here on a 2 cpu, 4 thread system.
--
viresh
=== David Long ===
=== Highlights ===
* Attended Connect. The keynotes were particularly good this time.
Kudos to all who arranged this and those who arranged the weather (not
that I got outside much).
* Although the uprobe code is installing and taking the breakpoint
properly, it is getting lost somewhere when it goes to execute the
probed instruction out-of-line. I've tried to isolate this a couple of
different ways, but no success yet.
=== Plans ===
* Continue isolating the xol problem and restructuring the emulation code.
=== Issues ===
-dl
Hi Guys,
Below are the hangout recordings of Scheduler Internals by Vincent Guittot,
done at LCA13.
We have another version of this recording, made with other cameras, but
its size is 30 GB and so it is hard to upload. In case you need that,
please contact me.
Day 1: http://www.youtube.com/watch?v=2yzelou80JE
Day 2: http://www.youtube.com/watch?v=fN11Lltx1nQ
Thanks to Naresh for arranging for hangouts.
--
Viresh
This is to let you know that the migration of lists.linaro.org has been
successfully completed.
As per the email I sent on Wednesday, it may take some time for the new
address of the server to be seen by your computer. You can check this by
trying to connect to the web site:
http://lists.linaro.org/
If you are able to connect and you do not get an error, this means you are
connecting to the new server and you can send email to the lists.
If you experience any problems after the weekend and you find that you
still cannot connect to the server, please reply to this email to let us
know.
Regards
Philip
IT Services Manager
Linaro
== Ulf Hansson ==
=== Highlights ===
General:
* Last week spent at Linaro Connect Hong Kong. A great week!
Storage:
* Reviewing patches on mmc-list.
* Discussing sent patchset to enable runtime pm support for mmc/sd block device.
* Rework parts of the HS200 and SDR104 support in the mmc protocol
layer. First part for tuning sequence done, patch will be pushed to
mmc-list shortly.
Clk:
* Preparing patchset for upstreaming patches that will add support for
abx500 clocks.
* Preparing patchset to update different driver's clk support used by ux500.
* Resent patches for clk_set_parent fixup and for disable unprepared
clocks at late init.
* Diving into discussion around doing DVFS through the clock API.
Reviewing related patches.
=== Plans ===
Storage:
* Speed up work around the mmc power management blueprint so we can
finalize this work as soon as possible.
* Push patches for mmci host driver, to support UHS cards/HS200, CMD23.
* Push patches for mmci host driver to extend the power management
support. Context save/restore are for example missing.
* Push patches for mmci host driver to add support for new STE 8540 variant.
Clk:
* Optimizations and bug fixes for ux500 clk implementations.
* Send RFC/PATCH for DVFS clock type used by ux500.
* Follow up on previously sent patchset.
=== Issues ===
* None.
On Fri, Mar 15, 2013 at 11:54 AM, Chao Xie <xiechao.mail(a)gmail.com> wrote:
> hi
> It may be a old topic.
> Now the cpufreq governors will sample for system work load. The
> scheduler knows about the current workload of each core. So why not
> make use of it? The sampling takes some time, so by the time cpufreq
> increases the frequency, the system has been busy for a period of
> time. Making use of the scheduler information can reduce the time spent
> on sampling.
Hi Chao,
I work for the Linaro Power Management Working Group and we are aware of
this problem and the proposed solution. We have a dedicated blueprint
towards this goal:
https://blueprints.launchpad.net/linaro-big-little-system/+spec/sched-coope…
--
viresh
Replaces the "platform_get_resource() for IORESOURCE_IRQ" with
platform_get_resource_byname().
Both in exynos4 and exynos5, FIMD IP has 3 interrupts in the order: "fifo",
"vsync", and "lcd_sys".
But the FIMD driver expects the "vsync" interrupt to be mentioned as the
1st parameter in the FIMD DT node. So to meet this expectation of the
driver, the FIMD DT node was forced to be made by keeping "vsync" as the
1st parameter.
For example in exynos4, the FIMD DT node has interrupt numbers
mentioned as <11, 1> <11, 0> <11, 2> keeping "vsync" as the 1st parameter.
This patch fixes the above mentioned "hack" of re-ordering of the
FIMD interrupt numbers by getting interrupt resource of FIMD by using
platform_get_resource_byname().
Signed-off-by: Vikas Sajjan <vikas.sajjan(a)linaro.org>
---
drivers/gpu/drm/exynos/exynos_drm_fimd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
index 1ea173a..cd79d38 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
@@ -945,7 +945,7 @@ static int fimd_probe(struct platform_device *pdev)
return -ENXIO;
}
- res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
+ res = platform_get_resource_byname(pdev, IORESOURCE_IRQ, "vsync");
if (!res) {
dev_err(dev, "irq request failed.\n");
return -ENXIO;
--
1.7.9.5
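The difference between the index-based and the name-based lookup in the patch above can be sketched in plain user-space C. This is only a mock: struct irq_res and both helpers are hypothetical stand-ins, not the kernel's struct resource or platform_get_resource*() implementations. The table lists the interrupts in the hardware order given in the commit message, with illustrative numbers inferred from its <11, 0..2> example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a named IRQ resource. */
struct irq_res {
	const char *name;
	int irq;
};

/* FIMD interrupts in hardware order; the numbers are illustrative only. */
static const struct irq_res fimd_irqs[] = {
	{ "fifo",    0 },
	{ "vsync",   1 },
	{ "lcd_sys", 2 },
};

/* Index-based lookup: the result depends on how the node is ordered. */
static const struct irq_res *get_irq_by_index(const struct irq_res *res,
					      size_t n, size_t index)
{
	assert(res != NULL);
	return index < n ? &res[index] : NULL;
}

/* Name-based lookup: the result does not depend on the ordering. */
static const struct irq_res *get_irq_by_name(const struct irq_res *res,
					     size_t n, const char *name)
{
	assert(res != NULL && name != NULL);
	for (size_t i = 0; i < n; i++)
		if (strcmp(res[i].name, name) == 0)
			return &res[i];
	return NULL;
}
```

With the table in hardware order, index 0 yields "fifo", which is why the DT node had to be re-ordered for the old code, while the lookup by "vsync" finds the right entry wherever it sits.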
This patch series adds support for DRM FIMD DT for Exynos4 DT Machines,
specifically for Exynos4412 SoC.
changes since v6:
- addressed comments and added interrupt-names = "fifo", "vsync", "lcd_sys"
in exynos4.dtsi and re-ordered the interrupt numbering to match the order in
interrupt combiner IP as suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>.
changes since v5:
- renamed the fimd binding documentation file name as "samsung-fimd.txt",
since it not only talks about exynos display controller but also about
previous samsung display controllers.
- rephrased an ambiguous statement about the interrupt combiner in the
fimd binding documentation as pointed out by
Sachin Kamat <sachin.kamat(a)linaro.org>
changes since v4:
- moved the fimd binding documentation to Documentation/devicetree/bindings/video/
as suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>
- added more fimd compatibility strings in fimd documentation as
discussed at https://patchwork.kernel.org/patch/2144861/ with
Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com> and
Tomasz Figa <tomasz.figa(a)gmail.com>
- modified compatible string for exynos4 fimd as "exynos4210-fimd"
exynos5 fimd as "exynos5250-fimd" to stick to the rule that compatible
value should be named after first specific SoC model in which this
particular IP version was included as discussed at
https://patchwork.kernel.org/patch/2144861/
- documented more about the interrupt combiner and their order as
suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>
changes since v3:
- rebased on
http://git.kernel.org/?p=linux/kernel/git/kgene/linux-samsung.git;a=shortlo…
changes since v2:
- added alias to 'fimd@11c00000' node
(reported by: Rahul Sharma <r.sh.open(a)gmail.com>)
- removed 'lcd0_data' node as there was already a similar node lcd_data24
(reported by: Jingoo Han <jg1.han(a)samsung.com>)
- replaced spaces with tabs in display-timing node
changes since v1:
- added new patch to add FIMD DT binding Documentation
- removed patch enabling SAMSUNG_DEV_BACKLIGHT and SAMSUNG_DEV_PWM
for mach-exynos4 DT
- added 'status' property to fimd DT node
Is based on branch "for-next-next"
http://git.kernel.org/?p=linux/kernel/git/kgene/linux-samsung.git;a=shortlo…
Sachin Kamat (1):
ARM: dts: Add lcd pinctrl node entries for EXYNOS4412 SoC
Vikas Sajjan (4):
ARM: dts: Add FIMD node to exynos4
ARM: dts: Add FIMD node and display timing node to
exynos4412-origen.dts
ARM: dts: Add FIMD AUXDATA node entry for exynos4 DT
ARM: dts: Add FIMD DT binding Documentation
.../devicetree/bindings/video/samsung-fimd.txt | 58 ++++++++++++++++++++
arch/arm/boot/dts/exynos4.dtsi | 8 +++
arch/arm/boot/dts/exynos4412-origen.dts | 22 ++++++++
arch/arm/boot/dts/exynos4x12-pinctrl.dtsi | 14 +++++
arch/arm/mach-exynos/mach-exynos4-dt.c | 2 +
5 files changed, 104 insertions(+)
create mode 100644 Documentation/devicetree/bindings/video/samsung-fimd.txt
--
1.7.9.5
Hello
You are receiving this email because you are subscribed to one or more
mailing lists provided by the lists.linaro.org server.
IT Services are announcing planned maintenance for this server scheduled
for *Friday 15th March 2013, starting at 2pm GMT*. The purpose of the work
is to move the service to another server. There will be some disruption
during this maintenance.
In order to ensure that you do not accidentally try to use the service
while it is being moved, the current server will be shut down at 2pm.
A further email will be sent on Friday afternoon to confirm that the
migration of the service is completed. However, due to the way servers are
found, it may take a while before your computer is able to connect to the
relocated service.
After the old server has been shut down, email sent to any of the lists
will be queued, but it is possible that the sending server will still be
trying to deliver the email to the old server rather than the new one when
it is started.
It is therefore *strongly* recommended that you do not send any email to an
@lists.linaro.org email address until you can connect to the new service,
which you will be able to test by trying to use a web browser to connect to
http://lists.linaro.org after you receive the email confirming that the
migration has been completed. Since the old service will be shut down, if
you are able to connect, you can be sure you have connected to the new
service.
If by Monday you are still unable to connect to the service or you are not
able to send email to an @lists.linaro.org email address, please send an
email to its(a)linaro.org.
Thank you.
Regards
Philip
IT Services Manager
Linaro
The only difference between schedule_delayed_work[_on]() and
queue_delayed_work[_on]() is the workqueue the work is scheduled on. We may
need to modify the delay for works queued with schedule_delayed_work[_on]()
calls, hence these helpers.
The first users of these new helpers are the cpufreq governors, which need to
modify the delay for their works.
Cc: Tejun Heo <tj(a)kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
include/linux/workqueue.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 2b58905..864c2b3 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -412,6 +412,7 @@ extern bool schedule_delayed_work_on(int cpu, struct delayed_work *work,
extern bool schedule_delayed_work(struct delayed_work *work,
unsigned long delay);
extern int schedule_on_each_cpu(work_func_t func);
+
extern int keventd_up(void);
int execute_in_process_context(work_func_t fn, struct execute_work *);
@@ -465,6 +466,11 @@ static inline long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg);
#endif /* CONFIG_SMP */
+#define mod_scheduled_delayed_work_on(cpu, dwork, delay) \
+ mod_delayed_work_on(cpu, system_wq, dwork, delay)
+#define mod_scheduled_delayed_work(dwork, delay) \
+ mod_delayed_work(system_wq, dwork, delay)
+
#ifdef CONFIG_FREEZER
extern void freeze_workqueues_begin(void);
extern bool freeze_workqueues_busy(void);
--
1.7.12.rc2.18.g61b472e
When a cpu goes to a deep idle state where its local timer is shut down,
it notifies the time framework to use the broadcast timer instead.
Unfortunately, the broadcast device could wake up any CPU, including an
idle one which is not concerned by the wake up at all.
This implies that, in the worst case, an idle CPU will wake up to send an
IPI to another idle cpu.
This patch solves this by setting the irq affinity to the cpu concerned
by the nearest timer event. This way, the CPU which is woken up is
guaranteed to be the one concerned by the next event, and we avoid an
unnecessary wakeup of another idle CPU.
As irq affinity is not supported by all the archs, a flag is needed
to specify which clocksource can handle it.
Daniel Lezcano (3):
time : pass broadcast parameter
time : set broadcast irq affinity
ARM: nomadik: add dynamic irq flag to the timer
Viresh Kumar (1):
ARM: timer-sp: Set dynamic irq affinity
arch/arm/common/timer-sp.c | 3 ++-
drivers/clocksource/nomadik-mtu.c | 3 ++-
include/linux/clockchips.h | 1 +
kernel/time/tick-broadcast.c | 40 +++++++++++++++++++++++++++++--------
4 files changed, 37 insertions(+), 10 deletions(-)
--
1.7.9.5
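As a rough illustration of the selection described in the cover letter above, the sketch below picks the CPU that owns the nearest pending event among the CPUs relying on the broadcast timer. It is a user-space mock under stated assumptions: the function name, the 32-bit mask and the sample expiry values are all made up, and the real logic lives in kernel/time/tick-broadcast.c and may differ.

```c
#include <assert.h>
#include <stdint.h>

#define NO_CPU (-1)

/* Made-up next-event expiry times (ns) for four CPUs, for illustration. */
static const uint64_t sample_events_ns[4] = { 500, 200, 900, 100 };

/*
 * Return the CPU whose next event expires first, considering only CPUs
 * whose bit is set in broadcast_mask (i.e. CPUs whose local timer is
 * stopped). Returns NO_CPU if the mask selects no CPU. The broadcast
 * irq affinity would then be set to the returned CPU.
 */
static int nearest_event_cpu(const uint64_t *next_event_ns,
			     unsigned int ncpus, uint32_t broadcast_mask)
{
	int target = NO_CPU;
	uint64_t earliest = UINT64_MAX;

	assert(next_event_ns != 0 && ncpus <= 32);
	for (unsigned int cpu = 0; cpu < ncpus; cpu++) {
		if (!(broadcast_mask & (UINT32_C(1) << cpu)))
			continue;	/* this CPU still has its local timer */
		if (next_event_ns[cpu] < earliest) {
			earliest = next_event_ns[cpu];
			target = (int)cpu;
		}
	}
	return target;
}
```

With the sample values, a mask of 0x7 (CPUs 0 to 2 idle) selects CPU 1, so only that CPU is woken by the broadcast interrupt and no IPI to another idle CPU is needed.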
Add display-timing node parsing to drm fimd and depends on
the display helper patchset at
http://lists.freedesktop.org/archives/dri-devel/2013-January/033998.html
changes since v12:
- Added dependency of "OF" for exynos drm fimd as suggested
by Inki Dae <inki.dae(a)samsung.com>
changes since v11:
- Oops, there was a build error, fixed that.
changes since v10:
- abandoned the pinctrl patch, as commented by Linus Walleij
<linus.walleij(a)linaro.org>
- added new patch to enable the OF_VIDEOMODE and FB_MODE_HELPERS for
EXYNOS DRM FIMD.
changes since v9:
- replaced IS_ERR_OR_NULL() with IS_ERR(), since IS_ERR_OR_NULL()
will be deprecated, as discussed at
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-January/140543.h…
http://www.mail-archive.com/linux-omap@vger.kernel.org/msg78030.html
changes since v8:
- replaced IS_ERR() with IS_ERR_OR_NULL(),
because devm_pinctrl_get_select_default can return NULL,
if CONFIG_PINCTRL is disabled.
- modified the error log, such that it shall NOT cross 80 columns.
- added Acked-by.
changes since v7:
- addressed comments from Joonyoung Shim <jy0922.shim(a)samsung.com>
to remove an unnecessary variable.
changes since v6:
- addressed comments from Inki Dae <inki.dae(a)samsung.com> to
separate out the pinctrl functionality into a separate patch.
changes since v5:
- addressed comments from Inki Dae <inki.dae(a)samsung.com>,
to remove the allocation of 'fbmode' and replaced
'-1' in "of_get_fb_videomode(dev->of_node, fbmode, -1)" with
OF_USE_NATIVE_MODE.
changes since v4:
- addressed comments from Paul Menzel
<paulepanter(a)users.sourceforge.net>, to modify the commit message
changes since v3:
- addressed comments from Sean Paul <seanpaul(a)chromium.org>, to modify
the return values and print messages.
changes since v2:
- moved 'devm_pinctrl_get_select_default' function call under
'if (pdev->dev.of_node)', this makes NON-DT code unchanged.
(reported by: Rahul Sharma <r.sh.open(a)gmail.com>)
changes since v1:
- addressed comments from Sean Paul <seanpaul(a)chromium.org>
Vikas Sajjan (2):
video: drm: exynos: Add display-timing node parsing using video
helper function
drm/exynos: enable OF_VIDEOMODE and FB_MODE_HELPERS for exynos drm
fimd
drivers/gpu/drm/exynos/Kconfig | 4 +++-
drivers/gpu/drm/exynos/exynos_drm_fimd.c | 24 ++++++++++++++++++++----
2 files changed, 23 insertions(+), 5 deletions(-)
--
1.7.9.5
Hi,
I'm developing a baremetal hypervisor for Cortex-A15 and I'm using
Arndale Board as a reference platform. ATM, I'm working on running
Linaro 13.02 as a guest.
I've noticed that in `arch/arm/kernel/arch_timer.c` Linux prefers the
virtual IRQ timers (CNTV_):
static bool arch_timer_use_virtual = false;
as a timer source and I'm wondering if there's a reason for this. Is
it even possible for virtual timer to be available, while physical one
is not? I don't think so. So why would Linux want to default to
virtual one instead of physical one, especially in the situation when
it was started in SVC mode and HYP is not accessible to it?
From the hypervisor's point of view, it seems natural that guest does not
care about virtualization stuff, does not try to be smarter than
necessary and simply uses physical timer (PL1) all the time.
It's hypervisor's job to trap guest accesses to physical timer (CNTP_)
and route them to virtual timer (CNTV_) so virtual offset (CNTVOFF)
value can be easily switched (again, by hypervisor, not the guest)
when handling multiple guests, to correctly handle virtual time for
them. In the opposite direction, the hardware virtual timer irq is just
renumbered to appear to guest as a physical one.
While emulating the CNTVOFF and CNTV_ is entirely possible too and
required for completeness, it seems like additional work for no
apparent reason. So I wanted to clarify this. Maybe there is something
that I don't understand or I misread the intentions described in
reference manual.
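For reference, the relation between the two counters discussed above is that the virtual count is the physical count minus CNTVOFF. A minimal user-space sketch of that arithmetic follows. The function names are hypothetical, and the policy of growing the offset while a guest is descheduled is only one possible hypervisor design, assumed here for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Virtual count as seen by a guest: physical count minus CNTVOFF. */
static uint64_t virtual_count(uint64_t cntpct, uint64_t cntvoff)
{
	return cntpct - cntvoff;	/* 64-bit modular arithmetic */
}

/*
 * Hypothetical policy: while a guest is descheduled its virtual time
 * stands still, so the hypervisor grows that guest's offset by the
 * length of the pause before switching it back in.
 */
static uint64_t cntvoff_after_pause(uint64_t cntvoff, uint64_t paused_at,
				    uint64_t resumed_at)
{
	assert(resumed_at >= paused_at);
	return cntvoff + (resumed_at - paused_at);
}
```

For example, a guest with offset 400 that is paused at physical count 1000 and resumed at 1300 gets offset 700, so it still reads a virtual count of 600 on resume.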
--
Dawid Ciężarkiewicz
B-Labs. http://b-labs.com/
Add display-timing node parsing to drm fimd and depends on
the display helper patchset at
http://lists.freedesktop.org/archives/dri-devel/2013-January/033998.html
changes since v11:
- Oops, there was a build error, fixed that.
changes since v10:
- abandoned the pinctrl patch, as commented by Linus Walleij
<linus.walleij(a)linaro.org>
- added new patch to enable the OF_VIDEOMODE and FB_MODE_HELPERS for
EXYNOS DRM FIMD.
changes since v9:
- replaced IS_ERR_OR_NULL() with IS_ERR(), since IS_ERR_OR_NULL()
will be deprecated, as discussed at
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-January/140543.h…
http://www.mail-archive.com/linux-omap@vger.kernel.org/msg78030.html
changes since v8:
- replaced IS_ERR() with IS_ERR_OR_NULL(),
because devm_pinctrl_get_select_default can return NULL,
if CONFIG_PINCTRL is disabled.
- modified the error log, such that it shall NOT cross 80 columns.
- added Acked-by.
changes since v7:
- addressed comments from Joonyoung Shim <jy0922.shim(a)samsung.com>
to remove an unnecessary variable.
changes since v6:
- addressed comments from Inki Dae <inki.dae(a)samsung.com> to
separate out the pinctrl functionality into a separate patch.
changes since v5:
- addressed comments from Inki Dae <inki.dae(a)samsung.com>,
to remove the allocation of 'fbmode' and replaced
'-1' in "of_get_fb_videomode(dev->of_node, fbmode, -1)" with
OF_USE_NATIVE_MODE.
changes since v4:
- addressed comments from Paul Menzel
<paulepanter(a)users.sourceforge.net>, to modify the commit message
changes since v3:
- addressed comments from Sean Paul <seanpaul(a)chromium.org>, to modify
the return values and print messages.
changes since v2:
- moved 'devm_pinctrl_get_select_default' function call under
'if (pdev->dev.of_node)', this makes NON-DT code unchanged.
(reported by: Rahul Sharma <r.sh.open(a)gmail.com>)
changes since v1:
- addressed comments from Sean Paul <seanpaul(a)chromium.org>
Vikas Sajjan (2):
video: drm: exynos: Add display-timing node parsing using video
helper function
drm/exynos: enable OF_VIDEOMODE and FB_MODE_HELPERS for exynos drm
fimd
drivers/gpu/drm/exynos/Kconfig | 2 ++
drivers/gpu/drm/exynos/exynos_drm_fimd.c | 27 +++++++++++++++++++++++----
2 files changed, 25 insertions(+), 4 deletions(-)
--
1.7.9.5
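The back-and-forth between IS_ERR() and IS_ERR_OR_NULL() in the v8 and
v9 changelog entries above comes down to whether a NULL return counts as
a failure. The following is a userspace sketch of the error-pointer
convention, not the kernel's own code: the sketch_* names are
hypothetical stand-ins, and the sketch assumes that error codes are
stored as small negative values in the top 4095 addresses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed upper bound on errno values encoded in a pointer. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Decode the errno value from an error pointer. */
static long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True only for pointers in the reserved error range; NULL is not one. */
static bool sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* True for error pointers and also for NULL. */
static bool sketch_is_err_or_null(const void *p)
{
	return p == NULL || sketch_is_err(p);
}
```

With these definitions a NULL return passes the sketch_is_err() check
but fails sketch_is_err_or_null(), which is the distinction the two
changelog entries are arguing about.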
Has anyone worked on adding 64k page support to the ARM v7 architecture, in particular the ARM Cortex A9? I know that there is HugeTLB support available but I'm wondering if anyone has tried making the native page size 64k rather than 4k. In other words, make PAGE_SHIFT 16 instead of 12 and all of the corresponding changes to the way page tables are handled. Has this been attempted already? Can someone point me to a patch?
Thanks!
David Betz
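For reference, the arithmetic behind the PAGE_SHIFT question can be
sketched in plain C. The helper names below are hypothetical and this is
only the size/mask calculation, not the page-table changes the question
asks about.

```c
#include <stdint.h>

#define PAGE_SHIFT_4K	12	/* 1 << 12 = 4096 bytes */
#define PAGE_SHIFT_64K	16	/* 1 << 16 = 65536 bytes */

/* Page size in bytes for a given shift. */
static uint64_t page_size(unsigned int shift)
{
	return UINT64_C(1) << shift;
}

/* Mask that clears the offset-within-page bits. */
static uint64_t page_mask(unsigned int shift)
{
	return ~(page_size(shift) - 1);
}

/* Round an address up to the next page boundary. */
static uint64_t page_align_up(uint64_t addr, unsigned int shift)
{
	return (addr + page_size(shift) - 1) & page_mask(shift);
}

/* Number of pages needed to hold 'bytes' bytes. */
static uint64_t pages_needed(uint64_t bytes, unsigned int shift)
{
	return page_align_up(bytes, shift) >> shift;
}
```

A 1 MiB buffer needs 256 pages with a shift of 12 but only 16 with a
shift of 16, while a 4097-byte allocation rounds up to 8 KiB in the
first case and to 64 KiB in the second.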
Add display-timing node parsing to drm fimd. This series depends on
the display helper patchset at
http://lists.freedesktop.org/archives/dri-devel/2013-January/033998.html
changes since v10:
- abandoned the pinctrl patch, as commented by Linus Walleij
<linus.walleij(a)linaro.org>
- added new patch to enable the OF_VIDEOMODE and FB_MODE_HELPERS for
EXYNOS DRM FIMD.
changes since v9:
- replaced IS_ERR_OR_NULL() with IS_ERR(), since IS_ERR_OR_NULL()
will be deprecated, as discussed at
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-January/140543.h…
http://www.mail-archive.com/linux-omap@vger.kernel.org/msg78030.html
changes since v8:
- replaced IS_ERR() with IS_ERR_OR_NULL(),
because devm_pinctrl_get_select_default can return NULL
if CONFIG_PINCTRL is disabled.
- modified the error log so that it does NOT cross 80 columns.
- added Acked-by.
changes since v7:
- addressed comments from Joonyoung Shim <jy0922.shim(a)samsung.com>
to remove an unnecessary variable.
changes since v6:
- addressed comments from Inki Dae <inki.dae(a)samsung.com> to
separate out the pinctrl functionality into a separate patch.
changes since v5:
- addressed comments from Inki Dae <inki.dae(a)samsung.com>,
to remove the allocation of 'fbmode' and replaced
'-1' in "of_get_fb_videomode(dev->of_node, fbmode, -1)" with
OF_USE_NATIVE_MODE.
changes since v4:
- addressed comments from Paul Menzel
<paulepanter(a)users.sourceforge.net>, to modify the commit message
changes since v3:
- addressed comments from Sean Paul <seanpaul(a)chromium.org>, to modify
the return values and print messages.
changes since v2:
- moved 'devm_pinctrl_get_select_default' function call under
'if (pdev->dev.of_node)', this makes NON-DT code unchanged.
(reported by: Rahul Sharma <r.sh.open(a)gmail.com>)
changes since v1:
- addressed comments from Sean Paul <seanpaul(a)chromium.org>
Vikas Sajjan (2):
video: drm: exynos: Add display-timing node parsing using video
helper function
drm/exynos: enable OF_VIDEOMODE and FB_MODE_HELPERS for exynos drm
fimd
drivers/gpu/drm/exynos/Kconfig | 2 ++
drivers/gpu/drm/exynos/exynos_drm_fimd.c | 27 +++++++++++++++++++++++----
2 files changed, 25 insertions(+), 4 deletions(-)
--
1.7.9.5
The following patch introduced per-cpu timers or works for the ondemand
and conservative governors:
commit 2abfa876f1117b0ab45f191fb1f82c41b1cbc8fe
Author: Rickard Andersson <rickard.andersson(a)stericsson.com>
Date: Thu Dec 27 14:55:38 2012 +0000
cpufreq: handle SW coordinated CPUs
This causes unnecessary additional interrupts on all cpus even when the load
was recently evaluated by another cpu, i.e. when the load has recently been
evaluated by cpu x, no other cpu needs to evaluate it again for the next
sampling_rate period.
Some code is already present to avoid that, but we are still getting timer
interrupts on all cpus. A good way of avoiding this would be to modify the
delays for all cpus (policy->cpus) whenever any cpu has evaluated the load.
This patchset tries to fix this issue.
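As a rough illustration of the idea, and not the code in these patches,
the skip-and-reschedule logic could look like the following sketch. The
struct, the function names and the "half a sampling period" threshold
are assumptions made for the example, and locking between cpus is left
out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state shared by all cpus of one policy. */
struct policy_sample {
	uint64_t last_eval_us;		/* when any cpu last evaluated the load */
	uint64_t sampling_rate_us;	/* governor sampling period */
};

/*
 * Called from a cpu's timer handler. Returns true if this cpu should
 * evaluate the load now, and records the evaluation time if so.
 * Assumed threshold: skip if another cpu evaluated the load less than
 * half a sampling period ago.
 */
static bool should_evaluate_load(struct policy_sample *ps, uint64_t now_us)
{
	if (now_us - ps->last_eval_us < ps->sampling_rate_us / 2)
		return false;

	ps->last_eval_us = now_us;
	return true;
}

/*
 * Delay to program for every cpu in the policy, measured from the most
 * recent evaluation rather than from each cpu's own last wakeup.
 */
static uint64_t next_delay_us(const struct policy_sample *ps, uint64_t now_us)
{
	uint64_t due = ps->last_eval_us + ps->sampling_rate_us;

	return due > now_us ? due - now_us : 0;
}
```

A cpu that wakes up shortly after another cpu has sampled skips the
evaluation and re-arms its timer for the remainder of the shared period
instead of a full one.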
These patches are applied here:
http://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/h…
V1->V2:
- Dropped Workqueue modifications and use system_wq directly.
Viresh Kumar (2):
cpufreq: ondemand: Don't update sample_type if we don't evaluate load
again
cpufreq: governors: Avoid unnecessary per cpu timer interrupts
drivers/cpufreq/cpufreq_conservative.c | 8 ++++---
drivers/cpufreq/cpufreq_governor.c | 39 ++++++++++++++++++++++++----------
drivers/cpufreq/cpufreq_governor.h | 2 ++
drivers/cpufreq/cpufreq_ondemand.c | 34 ++++++++++++++---------------
4 files changed, 52 insertions(+), 31 deletions(-)
--
1.7.12.rc2.18.g61b472e
== David Long ==
=== Highlights ===
* I'm unfortunately still trying to sort out what change in basic uprobes
support is causing the upleveled ARM uprobe patch to behave
incorrectly. I still think it's a small issue; it's just a matter of
tracking it down.
* Spent a little time getting ready for Connect.
=== Plans ===
* Connect-specific activities will occupy most of my time in the coming
week.
* Find the problem in the upleveled patch and move forward.
=== Issues ===
-dl
=== Highlights ===
* Got my slides ready for Connect
* Acked Serban's reduced scope ashmem compat_ioctl patch
* Saw the experimental/android-3.8 branch show up and sent Arve a
build fix patch
* Reviewed the android-3.8 branch and found some staging fixes that
should probably go upstream. After consulting the Google devs, sent them
on to lkml. Little bit of contention on the list, so I may have to drop
one and resend the rest.
* Sorted out ABS travel expenses
* Pinged Jason Wessel on Anton's KDB patches
* Updated blueprints and held bi-weekly Android Upstreaming hangout meeting.
* Sent out RFD on submitting the sync driver to staging. Got no
objections, so submitted the patches. Greg seemed ok with most of it,
so it is likely to be merged for 3.10
* Got invited to, but declined, a conference on the future of UTC.
* Spent some time testing valid-interval lock concept for timekeeping
* Did packing and prep for Connect.
=== Plans ===
* Linaro Connect mostly.
* Still need to review Serban's binder patches
* Still need to look into Android's support of large files with 32-bit
applications
* Need to look into some community bugs I've been ignoring.
=== Issues ===
* NA
== Ulf Hansson ==
=== Highlights ===
Storage:
* Reviewing patches on mmc-list.
* Patches for fixing signal voltage switch procedure for SD card UHS
mode ready and merged by Chris.
* Rework parts of the HS200 and SDR104 support in the mmc protocol
layer. First part for tuning sequence done, patch will be pushed to
mmc-list shortly.
* Sent patch to enable runtime pm support for mmc/sd block device. The
intention is to let idle-time BKOPS build upon this.
Clk:
* Started to prepare a patchset for upstreaming patches that will add
support for abx500 clocks, update different driver's clk support and
include ux500 clk optimizations.
* Reviewing DVFS clock related patches from Mike Turquette.
Interesting work here.
=== Plans ===
Storage:
* Do an overall analysis of the eMMC 4.5/4.6 features. Check what
can be considered finished, what needs further fixing, and point out
the new features on which we should focus in the Linaro
storage team.
* Keep focus on the mmc power management blueprint so we can finalize
this work as soon as possible.
* Push patches for mmci host driver to support UHS cards.
* Push patches for mmci host driver to further extend the power
management support.
* Push patches for mmci host driver to add new features like CMD23
support and more.
* Push patches for mmci host driver to add support for new STE 8540 variant.
Clk:
* Upstreaming of internal work for ux500.
* Follow up on patchset for fixing clk_set_parent API.
* Follow up on patchset for disabling unused prepared clks.
=== Issues ===
* None.
Kind regards
Ulf Hansson
Add display-timing node parsing to drm fimd. This series depends on
the display helper patchset at
http://lists.freedesktop.org/archives/dri-devel/2013-January/033998.html
It also adds pinctrl support for drm fimd.
changes since v9:
- replaced IS_ERR_OR_NULL() with IS_ERR(), since IS_ERR_OR_NULL()
will be deprecated, as discussed at
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-January/140543.h…
http://www.mail-archive.com/linux-omap@vger.kernel.org/msg78030.html
changes since v8:
- replaced IS_ERR() with IS_ERR_OR_NULL(),
because devm_pinctrl_get_select_default can return NULL
if CONFIG_PINCTRL is disabled.
- modified the error log so that it does NOT cross 80 columns.
- added Acked-by.
changes since v7:
- addressed comments from Joonyoung Shim <jy0922.shim(a)samsung.com>
to remove an unnecessary variable.
changes since v6:
- addressed comments from Inki Dae <inki.dae(a)samsung.com> to
separate out the pinctrl functionality into a separate patch.
changes since v5:
- addressed comments from Inki Dae <inki.dae(a)samsung.com>,
to remove the allocation of 'fbmode' and replaced
'-1' in "of_get_fb_videomode(dev->of_node, fbmode, -1)" with
OF_USE_NATIVE_MODE.
changes since v4:
- addressed comments from Paul Menzel
<paulepanter(a)users.sourceforge.net>, to modify the commit message
changes since v3:
- addressed comments from Sean Paul <seanpaul(a)chromium.org>, to modify
the return values and print messages.
changes since v2:
- moved 'devm_pinctrl_get_select_default' function call under
'if (pdev->dev.of_node)', this makes NON-DT code unchanged.
(reported by: Rahul Sharma <r.sh.open(a)gmail.com>)
changes since v1:
- addressed comments from Sean Paul <seanpaul(a)chromium.org>
Vikas Sajjan (2):
video: drm: exynos: Add display-timing node parsing using video
helper function
video: drm: exynos: Add pinctrl support to fimd
drivers/gpu/drm/exynos/exynos_drm_fimd.c | 33 ++++++++++++++++++++++++++----
1 file changed, 29 insertions(+), 4 deletions(-)
--
1.7.9.5
Hi,
Please find a link[1] to some of the things we plan to discuss and
work on in Hong Kong next week.
If you're interested in some of these topics and are attending in
person, please come and say hello.
See you in Hong Kong!
Regards,
Amit
-----------------------------------------------
PMWG Tech Lead, Linaro
[1] https://wiki.linaro.org/WorkingGroups/PowerManagement/Doc/HK_LCA2013