- linaro-kernel - lists.linaro.org

[ACTIVITY] (Linus Walleij) 2013-03-22 - 2013-04-12

by Linus Walleij

== Linus Walleij linusw ==

=== Highlights ===

* Completed the multiplatform support for ux500; the result has been pulled into the MFD and ARM SoC trees. There is some immediate fallout from this that needs to be fixed, along with other trouble in the ux500 world.
* Collected pinctrl fixes and new patches. Sent a pull request for fixes to Torvalds and he pulled it in.
* Collected GPIO fixes and new patches. Sent two pull requests for fixes to Torvalds and he pulled them. This took some time due to a huge pile of cleanup patches.
* Sent pull requests to ARM SoC for a few ux500 things; I am probably still missing some topics.
* Reviewed and merged a few of Fabio's backports to the internal ST-Ericsson tree.
* Figured out how to do PCI device tree properly and implemented it for the Integrator/AP. Patches are pending review but in nice shape. Had to hunt down a specific nasty problem with PCI hosts being rootless (no parent device) on ARM; this will not work going forward, so I proposed a patch and iterated.
* I have a pretty big device tree patch bundle for the U300 building up, but want to have it in a more complete state before I post. The plan for U300 is: enable everything for device tree, delete board files, multiplatform, in that order.

=== Plans ===

* Get better at sending these reports every week.
* A short paternity leave 6/5->9/5 in May.
* Find all regressions for ux500 lurking in the linux-next tree.
* Convert the Nomadik pinctrl driver to register GPIO ranges from the gpiochip side.
* Test the PL08x patches on the Ericsson Research PB11MPCore and submit platform data for using PL08x DMA on that platform.
* Get hands dirty with regmap.

=== Issues ===

* A bit overloaded; it is especially hard to keep track of all the ux500 stuff in my head. Could use another co-maintainer maybe.
* Things have been hectic internally at ST-Ericsson, diverting me from Linaro work.
* I am spending roughly 30-60 mins every day on internal review work on internal baseline and mainline patches-to-be.

Thanks,
Linus Walleij

12 years, 1 month


[PATCH Resend v5] sched: fix init NOHZ_IDLE flag

by Vincent Guittot

On my SMP platform, which is made of 5 cores in 2 clusters, the nr_busy_cpus field of the sched_group_power struct is not zero when the platform is fully idle. The root cause is as follows: during the boot sequence, some CPUs reach the idle loop and set their NOHZ_IDLE flag while waiting for other CPUs to boot. But the nr_busy_cpus field is initialized later, with the assumption that all CPUs are in the busy state, whereas some CPUs have already set their NOHZ_IDLE flag.

More generally, the NOHZ_IDLE flag must be initialized when new sched_domains are created in order to ensure that NOHZ_IDLE and nr_busy_cpus are aligned.

This condition can be ensured by adding a synchronize_rcu between the destruction of old sched_domains and the creation of new ones, so that the NOHZ_IDLE flag will not be updated through an old sched_domain once it has been initialized. But this solution introduces an additional latency in the rebuild sequence that is called during CPU hotplug.

As suggested by Frederic Weisbecker, another solution is to have the same RCU lifecycle for both the NOHZ_IDLE flag and the sched_domain struct. I have introduced a new sched_domain_rq struct that is the entry point for both sched_domains and objects that must follow the same lifecycle, like the NOHZ_IDLE flags. They share the same RCU lifecycle and are always synchronized.

The synchronization is done at the cost of:
- an additional indirection for accessing the first sched_domain level
- an additional indirection and a rcu_dereference before accessing the NOHZ_IDLE flag.

Change since v4:
- link both sched_domain and NOHZ_IDLE flag in one RCU object so their states are always synchronized.
Change since V3:
- NOHZ flag is not cleared if a NULL domain is attached to the CPU
- Remove patch 2/2 which becomes useless with latest modifications

Change since V2:
- change the initialization to idle state instead of busy state so a CPU that enters idle during the build of the sched_domain will not corrupt the initialization state

Change since V1:
- remove the patch for SCHED softirq on an idle core use case as it was a side effect of the other use cases.

Signed-off-by: Vincent Guittot <vincent.guittot(a)linaro.org>
---
 include/linux/sched.h |    6 +++
 kernel/sched/core.c   |  105 ++++++++++++++++++++++++++++++++++++++++++++-----
 kernel/sched/fair.c   |   35 +++++++++++------
 kernel/sched/sched.h  |   24 +++++++++--
 4 files changed, 145 insertions(+), 25 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index d35d2b6..2a52188 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -959,6 +959,12 @@ struct sched_domain {
 	unsigned long span[0];
 };
 
+struct sched_domain_rq {
+	struct sched_domain *sd;
+	unsigned long flags;
+	struct rcu_head rcu;	/* used during destruction */
+};
+
 static inline struct cpumask *sched_domain_span(struct sched_domain *sd)
 {
 	return to_cpumask(sd->span);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7f12624..69e2313 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5602,6 +5602,15 @@ static void destroy_sched_domains(struct sched_domain *sd, int cpu)
 		destroy_sched_domain(sd, cpu);
 }
 
+static void destroy_sched_domain_rq(struct sched_domain_rq *sd_rq, int cpu)
+{
+	if (!sd_rq)
+		return;
+
+	destroy_sched_domains(sd_rq->sd, cpu);
+	kfree_rcu(sd_rq, rcu);
+}
+
 /*
  * Keep a special pointer to the highest sched_domain that has
  * SD_SHARE_PKG_RESOURCE set (Last Level Cache Domain) for this
@@ -5632,10 +5641,23 @@ static void update_top_cache_domain(int cpu)
  * hold the hotplug lock.
  */
 static void
-cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
+cpu_attach_domain(struct sched_domain_rq *sd_rq, struct root_domain *rd,
+		int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
-	struct sched_domain *tmp;
+	struct sched_domain_rq *tmp_rq;
+	struct sched_domain *tmp, *sd = NULL;
+
+	/*
+	 * If we don't have any sched_domain and associated object, we can
+	 * directly jump to the attach sequence otherwise we try to degenerate
+	 * the sched_domain
+	 */
+	if (!sd_rq)
+		goto attach;
+
+	/* Get a pointer to the 1st sched_domain */
+	sd = sd_rq->sd;
 
 	/* Remove the sched domains which do not contribute to scheduling. */
 	for (tmp = sd; tmp; ) {
@@ -5658,14 +5680,17 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
 		destroy_sched_domain(tmp, cpu);
 		if (sd)
 			sd->child = NULL;
+		/* update sched_domain_rq */
+		sd_rq->sd = sd;
 	}
 
+attach:
 	sched_domain_debug(sd, cpu);
 
 	rq_attach_root(rq, rd);
-	tmp = rq->sd;
-	rcu_assign_pointer(rq->sd, sd);
-	destroy_sched_domains(tmp, cpu);
+	tmp_rq = rq->sd_rq;
+	rcu_assign_pointer(rq->sd_rq, sd_rq);
+	destroy_sched_domain_rq(tmp_rq, cpu);
 
 	update_top_cache_domain(cpu);
 }
@@ -5695,12 +5720,14 @@ struct sd_data {
 };
 
 struct s_data {
+	struct sched_domain_rq ** __percpu sd_rq;
 	struct sched_domain ** __percpu sd;
 	struct root_domain *rd;
 };
 
 enum s_alloc {
 	sa_rootdomain,
+	sa_sd_rq,
 	sa_sd,
 	sa_sd_storage,
 	sa_none,
@@ -5935,7 +5962,7 @@ static void init_sched_groups_power(int cpu, struct sched_domain *sd)
 		return;
 
 	update_group_power(sd, cpu);
-	atomic_set(&sg->sgp->nr_busy_cpus, sg->group_weight);
+	atomic_set(&sg->sgp->nr_busy_cpus, 0);
 }
 
 int __weak arch_sd_sibling_asym_packing(void)
@@ -6011,6 +6038,8 @@ static void set_domain_attribute(struct sched_domain *sd,
 
 static void __sdt_free(const struct cpumask *cpu_map);
 static int __sdt_alloc(const struct cpumask *cpu_map);
+static void __sdrq_free(const struct cpumask *cpu_map, struct s_data *d);
+static int __sdrq_alloc(const struct cpumask *cpu_map, struct s_data *d);
 
 static void __free_domain_allocs(struct s_data *d, enum s_alloc what,
 				 const struct cpumask *cpu_map)
@@ -6019,6 +6048,9 @@ static void __free_domain_allocs(struct s_data *d, enum s_alloc what,
 	case sa_rootdomain:
 		if (!atomic_read(&d->rd->refcount))
 			free_rootdomain(&d->rd->rcu); /* fall through */
+	case sa_sd_rq:
+		__sdrq_free(cpu_map, d); /* fall through */
+		free_percpu(d->sd_rq); /* fall through */
 	case sa_sd:
 		free_percpu(d->sd); /* fall through */
 	case sa_sd_storage:
@@ -6038,9 +6070,14 @@ static enum s_alloc __visit_domain_allocation_hell(struct s_data *d,
 	d->sd = alloc_percpu(struct sched_domain *);
 	if (!d->sd)
 		return sa_sd_storage;
+	d->sd_rq = alloc_percpu(struct sched_domain_rq *);
+	if (!d->sd_rq)
+		return sa_sd;
+	if (__sdrq_alloc(cpu_map, d))
+		return sa_sd_rq;
 	d->rd = alloc_rootdomain();
 	if (!d->rd)
-		return sa_sd;
+		return sa_sd_rq;
 	return sa_rootdomain;
 }
 
@@ -6466,6 +6503,46 @@ static void __sdt_free(const struct cpumask *cpu_map)
 	}
 }
 
+static int __sdrq_alloc(const struct cpumask *cpu_map, struct s_data *d)
+{
+	int j;
+
+	for_each_cpu(j, cpu_map) {
+		struct sched_domain_rq *sd_rq;
+
+		sd_rq = kzalloc_node(sizeof(struct sched_domain_rq),
+				GFP_KERNEL, cpu_to_node(j));
+		if (!sd_rq)
+			return -ENOMEM;
+
+		*per_cpu_ptr(d->sd_rq, j) = sd_rq;
+	}
+
+	return 0;
+}
+
+static void __sdrq_free(const struct cpumask *cpu_map, struct s_data *d)
+{
+	int j;
+
+	for_each_cpu(j, cpu_map)
+		if (*per_cpu_ptr(d->sd_rq, j))
+			kfree(*per_cpu_ptr(d->sd_rq, j));
+}
+
+static void build_sched_domain_rq(struct s_data *d, int cpu)
+{
+	struct sched_domain_rq *sd_rq;
+	struct sched_domain *sd;
+
+	/* Attach sched_domain to sched_domain_rq */
+	sd = *per_cpu_ptr(d->sd, cpu);
+	sd_rq = *per_cpu_ptr(d->sd_rq, cpu);
+	sd_rq->sd = sd;
+	/* Init flags */
+	set_bit(NOHZ_IDLE, sched_rq_flags(sd_rq));
+}
+
 struct sched_domain *build_sched_domain(struct sched_domain_topology_level *tl,
 		struct s_data *d, const struct cpumask *cpu_map,
 		struct sched_domain_attr *attr, struct sched_domain *child,
@@ -6495,6 +6572,7 @@ static int build_sched_domains(const struct cpumask *cpu_map,
 			       struct sched_domain_attr *attr)
 {
 	enum s_alloc alloc_state = sa_none;
+	struct sched_domain_rq *sd_rq;
 	struct sched_domain *sd;
 	struct s_data d;
 	int i, ret = -ENOMEM;
@@ -6547,11 +6625,18 @@ static int build_sched_domains(const struct cpumask *cpu_map,
 		}
 	}
 
+	/* Init objects that must follow the sched_domain lifecycle */
+	for_each_cpu(i, cpu_map) {
+		build_sched_domain_rq(&d, i);
+	}
+
 	/* Attach the domains */
 	rcu_read_lock();
 	for_each_cpu(i, cpu_map) {
-		sd = *per_cpu_ptr(d.sd, i);
-		cpu_attach_domain(sd, d.rd, i);
+		sd_rq = *per_cpu_ptr(d.sd_rq, i);
+		cpu_attach_domain(sd_rq, d.rd, i);
+		/* claim allocation of sched_domain_rq object */
+		*per_cpu_ptr(d.sd_rq, i) = NULL;
 	}
 	rcu_read_unlock();
 
@@ -6982,7 +7067,7 @@ void __init sched_init(void)
 		rq->last_load_update_tick = jiffies;
 
 #ifdef CONFIG_SMP
-		rq->sd = NULL;
+		rq->sd_rq = NULL;
 		rq->rd = NULL;
 		rq->cpu_power = SCHED_POWER_SCALE;
 		rq->post_schedule = 0;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7a33e59..1c7447e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5392,31 +5392,39 @@ static inline void nohz_balance_exit_idle(int cpu)
 
 static inline void set_cpu_sd_state_busy(void)
 {
+	struct sched_domain_rq *sd_rq;
 	struct sched_domain *sd;
 	int cpu = smp_processor_id();
 
-	if (!test_bit(NOHZ_IDLE, nohz_flags(cpu)))
-		return;
-	clear_bit(NOHZ_IDLE, nohz_flags(cpu));
-
 	rcu_read_lock();
-	for_each_domain(cpu, sd)
+	sd_rq = get_sched_domain_rq(cpu);
+
+	if (!sd_rq || !test_bit(NOHZ_IDLE, sched_rq_flags(sd_rq)))
+		goto unlock;
+	clear_bit(NOHZ_IDLE, sched_rq_flags(sd_rq));
+
+	for_each_domain_from_rq(sd_rq, sd)
 		atomic_inc(&sd->groups->sgp->nr_busy_cpus);
+unlock:
 	rcu_read_unlock();
 }
 
 void set_cpu_sd_state_idle(void)
 {
+	struct sched_domain_rq *sd_rq;
 	struct sched_domain *sd;
 	int cpu = smp_processor_id();
 
-	if (test_bit(NOHZ_IDLE, nohz_flags(cpu)))
-		return;
-	set_bit(NOHZ_IDLE, nohz_flags(cpu));
-
 	rcu_read_lock();
-	for_each_domain(cpu, sd)
+	sd_rq = get_sched_domain_rq(cpu);
+
+	if (!sd_rq || test_bit(NOHZ_IDLE, sched_rq_flags(sd_rq)))
+		goto unlock;
+	set_bit(NOHZ_IDLE, sched_rq_flags(sd_rq));
+
+	for_each_domain_from_rq(sd_rq, sd)
 		atomic_dec(&sd->groups->sgp->nr_busy_cpus);
+unlock:
 	rcu_read_unlock();
 }
 
@@ -5673,7 +5681,12 @@ static void run_rebalance_domains(struct softirq_action *h)
 
 static inline int on_null_domain(int cpu)
 {
-	return !rcu_dereference_sched(cpu_rq(cpu)->sd);
+	struct sched_domain_rq *sd_rq =
+		rcu_dereference_sched(cpu_rq(cpu)->sd_rq);
+	struct sched_domain *sd = NULL;
+	if (sd_rq)
+		sd = sd_rq->sd;
+	return !sd;
 }
 
 /*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index cc03cfd..f589306 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -417,7 +417,7 @@ struct rq {
 
 #ifdef CONFIG_SMP
 	struct root_domain *rd;
-	struct sched_domain *sd;
+	struct sched_domain_rq *sd_rq;
 
 	unsigned long cpu_power;
 
@@ -505,21 +505,37 @@ DECLARE_PER_CPU(struct rq, runqueues);
 
 #ifdef CONFIG_SMP
 
-#define rcu_dereference_check_sched_domain(p) \
+#define rcu_dereference_check_sched_domain_rq(p) \
 	rcu_dereference_check((p), \
 			      lockdep_is_held(&sched_domains_mutex))
 
+#define get_sched_domain_rq(cpu) \
+	rcu_dereference_check_sched_domain_rq(cpu_rq(cpu)->sd_rq)
+
+#define rcu_dereference_check_sched_domain(cpu) ({ \
+	struct sched_domain_rq *__sd_rq = get_sched_domain_rq(cpu); \
+	struct sched_domain *__sd = NULL; \
+	if (__sd_rq) \
+		__sd = __sd_rq->sd; \
+	__sd; \
+})
+
+#define sched_rq_flags(sd_rq) (&sd_rq->flags)
+
 /*
- * The domain tree (rq->sd) is protected by RCU's quiescent state transition.
+ * The domain tree (rq->sd_rq) is protected by RCU's quiescent state transition.
  * See detach_destroy_domains: synchronize_sched for details.
  *
  * The domain tree of any CPU may only be accessed from within
  * preempt-disabled sections.
  */
 #define for_each_domain(cpu, __sd) \
-	for (__sd = rcu_dereference_check_sched_domain(cpu_rq(cpu)->sd); \
+	for (__sd = rcu_dereference_check_sched_domain(cpu); \
 			__sd; __sd = __sd->parent)
 
+#define for_each_domain_from_rq(sd_rq, __sd) \
+	for (__sd = sd_rq->sd; __sd; __sd = __sd->parent)
+
 #define for_each_lower_domain(sd) for (; sd; sd = sd->child)
 
 /**
-- 
1.7.9.5
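
The lifecycle argument in this patch can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the patch itself: `struct domain`, `struct domain_rq`, `cpu_slot` and the helper names are hypothetical, and a C11 atomic pointer stands in for the RCU-protected `rq->sd_rq`. It only shows why bundling the idle flag with the domain pointer keeps the two in step: a reader that loads the pointer once cannot pair an old flag with a new domain, and every freshly built bundle starts in the idle state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the kernel's types. */
struct domain { int level; };

/* One object carries both the domain pointer and the idle flag. */
struct domain_rq {
	struct domain *sd;
	bool nohz_idle;
};

/* Per-CPU slot; reader and updater only ever exchange whole objects. */
static _Atomic(struct domain_rq *) cpu_slot;

/* Build a fresh bundle, always initialised to the idle state. */
static struct domain_rq *build_domain_rq(struct domain *sd)
{
	struct domain_rq *rq = calloc(1, sizeof(*rq));

	if (!rq)
		return NULL;
	rq->sd = sd;
	rq->nohz_idle = true;
	return rq;
}

/*
 * Publish the new bundle and hand back the old one. The kernel would
 * defer the free past a grace period (kfree_rcu); this single-threaded
 * sketch leaves the free to the caller.
 */
static struct domain_rq *attach_domain_rq(struct domain_rq *new_rq)
{
	return atomic_exchange_explicit(&cpu_slot, new_rq,
					memory_order_acq_rel);
}

/* Returns true only if the CPU actually moved from idle to busy. */
static bool mark_cpu_busy(void)
{
	struct domain_rq *rq =
		atomic_load_explicit(&cpu_slot, memory_order_acquire);

	if (!rq || !rq->nohz_idle)
		return false;
	rq->nohz_idle = false;
	return true;
}
```

Because the flag lives inside the published object, replacing the domain also replaces the flag, which is the property the commit message calls "the same RCU lifecycle".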

12 years, 1 month


[V2 patch 00/19] cpuidle: code consolidation

by Daniel Lezcano

This patch series provides some code consolidation across the different cpuidle drivers. It contains two parts: the first is the removal of the timekeeping flag, and the second is a common initialization routine.

All the drivers use the en_core_tk_irqen flag, which means it is not necessary to make the time computation optional. We can remove this flag and assume the cpuidle framework always manages this operation.

The cpuidle initialization code is duplicated across the different drivers in the same manner. The repeating pattern is:

SMP:

	cpuidle_register_driver(drv);
	for_each_possible_cpu(cpu) {
		dev = per_cpu(cpuidle_device, cpu);
		cpuidle_register_device(dev);
	}

UP:

	cpuidle_register_driver(drv);
	cpuidle_register_device(dev);

As the macro 'for_each_cpu' is a one-iteration loop on a UP machine, using the SMP initialization loop on UP works.

The patchset does some cleanup in different drivers in order to make the init code the same. Then it introduces a generic function:

	cpuidle_register(struct cpuidle_driver *drv, struct cpumask *cpumask)

The cpumask is for the coupled idle states.

The drivers are then modified to use this new function and to remove the duplicated code.

The benefit is observable in the diffstat: a net reduction of 322 lines of code (175 insertions, 497 deletions).

Tested-on: u8500
Tested-on: at91
Tested-on: intel i5
Tested-on: OMAP4

Compiled with and without CPU_IDLE for: u8500, at91, davinci, exynos, imx5, imx6, kirkwood, multi_v7 (for calxeda), omap2plus, s3c64, tegra1, tegra2, tegra3

Daniel Lezcano (19):
  ARM: shmobile: cpuidle: remove shmobile_enter_wfi function
  ARM: OMAP3: remove cpuidle_wrap_enter
  cpuidle: remove en_core_tk_irqen flag
  ARM: ux500: cpuidle: replace for_each_online_cpu by for_each_possible_cpu
  ARM: imx: cpuidle: create separate drivers for imx5/imx6
  cpuidle: make a single register function for all
  ARM: ux500: cpuidle: use init/exit common routine
  ARM: at91: cpuidle: use init/exit common routine
  ARM: OMAP3: cpuidle: use init/exit common routine
  ARM: s3c64xx: cpuidle: use init/exit common routine
  ARM: tegra: cpuidle: use init/exit common routine
  ARM: shmobile: cpuidle: use init/exit common routine
  ARM: OMAP4: cpuidle: use init/exit common routine
  ARM: tegra: cpuidle: use init/exit common routine for tegra2
  ARM: tegra: cpuidle: use init/exit common routine for tegra3
  ARM: calxeda: cpuidle: use init/exit common routine
  ARM: kirkwood: cpuidle: use init/exit common routine
  ARM: davinci: cpuidle: use init/exit common routine
  ARM: imx: cpuidle: use init/exit common routine

 Documentation/cpuidle/driver.txt                |    6 +
 arch/arm/mach-at91/cpuidle.c                    |   18 +--
 arch/arm/mach-davinci/cpuidle.c                 |   21 +---
 arch/arm/mach-exynos/cpuidle.c                  |    1 -
 arch/arm/mach-imx/Makefile                      |    1 +
 arch/arm/mach-imx/cpuidle-imx5.c                |   40 +++++++
 arch/arm/mach-imx/cpuidle-imx6q.c               |    3 +-
 arch/arm/mach-imx/cpuidle.c                     |   80 -------------
 arch/arm/mach-imx/cpuidle.h                     |   10 +-
 arch/arm/mach-imx/pm-imx5.c                     |   30 +----
 arch/arm/mach-omap2/cpuidle34xx.c               |   49 ++------
 arch/arm/mach-omap2/cpuidle44xx.c               |   23 +---
 arch/arm/mach-s3c64xx/cpuidle.c                 |   15 +--
 arch/arm/mach-shmobile/cpuidle.c                |   11 +-
 arch/arm/mach-shmobile/include/mach/common.h    |    3 -
 arch/arm/mach-shmobile/pm-sh7372.c              |    2 -
 arch/arm/mach-tegra/cpuidle-tegra114.c          |   27 +----
 arch/arm/mach-tegra/cpuidle-tegra20.c           |   31 +----
 arch/arm/mach-tegra/cpuidle-tegra30.c           |   28 +----
 arch/arm/mach-ux500/cpuidle.c                   |   33 +-----
 arch/powerpc/platforms/pseries/processor_idle.c |    1 -
 arch/sh/kernel/cpu/shmobile/cpuidle.c           |    1 -
 arch/x86/kernel/apm_32.c                        |    1 -
 drivers/acpi/processor_idle.c                   |    1 -
 drivers/cpuidle/cpuidle-calxeda.c               |   53 +--------
 drivers/cpuidle/cpuidle-kirkwood.c              |   18 +--
 drivers/cpuidle/cpuidle.c                       |  144 ++++++++++++++---------
 drivers/idle/intel_idle.c                       |    1 -
 include/linux/cpuidle.h                         |   20 ++--
 29 files changed, 175 insertions(+), 497 deletions(-)
 create mode 100644 arch/arm/mach-imx/cpuidle-imx5.c
 delete mode 100644 arch/arm/mach-imx/cpuidle.c

-- 
1.7.9.5
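
The consolidation described in this cover letter can be sketched in plain C. This is only an illustration under stated assumptions: the `sketch_*` types and helpers are hypothetical mock-ups, not the cpuidle API, and an integer CPU count replaces the cpumask. The point it shows is the one the letter makes: a single helper that registers the driver and then loops over the CPUs covers both SMP and UP, since UP is simply the one-iteration case.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4	/* hypothetical CPU count for the sketch */

/* Hypothetical mock-ups of a driver and a per-CPU device. */
struct sketch_driver { const char *name; int registered; };
struct sketch_device { int cpu; int registered; };

static struct sketch_device sketch_devices[SKETCH_NR_CPUS];

static int sketch_register_driver(struct sketch_driver *drv)
{
	if (!drv || drv->registered)
		return -1;
	drv->registered = 1;
	return 0;
}

static int sketch_register_device(struct sketch_device *dev)
{
	if (dev->registered)
		return -1;
	dev->registered = 1;
	return 0;
}

/* Unregister the first nr_cpus devices, then the driver. */
static void sketch_cpuidle_unregister(struct sketch_driver *drv, int nr_cpus)
{
	for (int cpu = 0; cpu < nr_cpus && cpu < SKETCH_NR_CPUS; cpu++)
		sketch_devices[cpu].registered = 0;
	if (drv)
		drv->registered = 0;
}

/* One entry point replacing the per-driver init loops. */
static int sketch_cpuidle_register(struct sketch_driver *drv, int nr_cpus)
{
	if (nr_cpus < 1 || nr_cpus > SKETCH_NR_CPUS)
		return -1;
	if (sketch_register_driver(drv))
		return -1;

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		sketch_devices[cpu].cpu = cpu;
		if (sketch_register_device(&sketch_devices[cpu])) {
			/* Undo everything registered so far. */
			sketch_cpuidle_unregister(drv, cpu);
			return -1;
		}
	}
	return 0;
}
```

Calling `sketch_cpuidle_register(&drv, 1)` models the UP case and `sketch_cpuidle_register(&drv, SKETCH_NR_CPUS)` the SMP case; the error path unwinds partial registration, which is one of the details each driver previously had to repeat.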

12 years, 1 month


[PATCH] cpufreq: Call __cpufreq_governor() with correct policy->cpus mask

by Viresh Kumar

__cpufreq_governor() must be called with the correct policy->cpus mask. In __cpufreq_remove_dev() we initially clear the CPU from policy->cpus with cpumask_clear_cpu() and then call __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT).

If the governor does some per-cpu work in its EXIT callback, this can lead to unpredictable behavior.

The generic governors in drivers/cpufreq/ don't do any per-cpu work in the EXIT callback, so we don't face any issues currently. But it's better to keep the code clean so we don't face any issues in the future.

Now, we call cpumask_clear_cpu() only when multiple CPUs are managed by the policy.

Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
 drivers/cpufreq/cpufreq.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index fd97a62..3564947 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1105,7 +1105,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 
 	WARN_ON(lock_policy_rwsem_write(cpu));
 	cpus = cpumask_weight(data->cpus);
-	cpumask_clear_cpu(cpu, data->cpus);
+
+	if (cpus > 1)
+		cpumask_clear_cpu(cpu, data->cpus);
 	unlock_policy_rwsem_write(cpu);
 
 	if (cpu != data->cpu) {
-- 
1.7.12.rc2.18.g61b472e
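
The effect of the new `cpus > 1` check can be shown with a tiny bitmask model. This is a hedged sketch only: `mask_t`, `mask_weight()` and `remove_cpu_from_policy()` are hypothetical helpers invented here, with an unsigned integer standing in for `policy->cpus`. The returned value is the mask a governor's EXIT callback would observe.

```c
#include <assert.h>

/* Hypothetical miniature cpumask: bit n set means CPU n is in the policy. */
typedef unsigned int mask_t;

/* Count the CPUs present in the mask. */
static int mask_weight(mask_t m)
{
	int n = 0;

	while (m) {
		n += (int)(m & 1u);
		m >>= 1;
	}
	return n;
}

/*
 * Drop cpu from the policy mask only when the policy manages other CPUs
 * as well. For a single-CPU policy the mask is left untouched, so any
 * per-cpu teardown in the EXIT path still sees the departing CPU.
 */
static mask_t remove_cpu_from_policy(mask_t cpus, int cpu)
{
	if (mask_weight(cpus) > 1)
		cpus &= ~(1u << cpu);
	return cpus;
}
```

With a two-CPU policy the departing CPU is cleared as before; with a single-CPU policy the mask stays non-empty, which is the case the patch is concerned with.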

12 years, 1 month


Re: [PATCH 9/9] powerpc: cpufreq: move cpufreq driver to drivers/cpufreq

by Viresh Kumar

On 3 April 2013 16:00, Benjamin Herrenschmidt <benh(a)kernel.crashing.org> wrote:
> On Wed, 2013-04-03 at 15:00 +0530, Viresh Kumar wrote:
>> On 31 March 2013 09:33, Viresh Kumar <viresh.kumar(a)linaro.org> wrote:
>> > Benjamin/Paul/Olof,
>> >
>> > Any comments on this?
>>
>> Ping!!
>
> I'm on vacation until end of April. No objection to the patch but
> somebody needs to test it.

Hi,

Can somebody else from the powerpc world give it a try?

OR

@Rafael: Can we get this pushed into linux-next as is? Then people would be forced to test it, and in case there are any complaints, I will fix them or you can revert it.

12 years, 1 month


[PATCH v6] sched: fix wrong rq's runnable_avg update with rt tasks

by Vincent Guittot

The current update of the rq's load can be erroneous when RT tasks are involved.

The update of the load of a rq that becomes idle is done only if avg_idle is not less than sysctl_sched_migration_cost. If RT tasks and short idle durations alternate, the runnable_avg will not be updated correctly and the time will be accounted as idle time when a CFS task wakes up.

A new idle_enter function is called when the next task is the idle function, so the elapsed time will be accounted as run time in the load of the rq, whatever the average idle time is. The function update_rq_runnable_avg is removed from idle_balance.

When an RT task is scheduled on an idle CPU, the update of the rq's load is not done when the rq exits the idle state, because CFS's functions are not called. Then idle_balance, which is called just before entering the idle function, updates the rq's load and makes the assumption that the elapsed time since the last update was only running time.

As a consequence, the rq's load of a CPU that only runs a periodic RT task is close to LOAD_AVG_MAX, whatever the running duration of the RT task is.

A new idle_exit function is called when the prev task is the idle function, so the elapsed time will be accounted as idle time in the rq's load.

Changes since V5:
- Rename idle_enter/exit function to idle_enter/exit_fair

Changes since V4:
- Rebase on v3.9-rc6 instead of Steven Rostedt's patches
- Create the post_schedule_idle function that was previously created by Steven's patches

Changes since V3:
- Remove dependency with CONFIG_FAIR_GROUP_SCHED
- Add a new idle_enter function and create a post_schedule callback for idle class
- Remove the update_runnable_avg from idle_balance

Changes since V2:
- remove useless definition for UP platform
- rebased on top of Steven Rostedt's patches: https://lkml.org/lkml/2013/2/12/558

Changes since V1:
- move code out of schedule function and create a pre_schedule callback for idle class instead.

Signed-off-by: Vincent Guittot <vincent.guittot(a)linaro.org>
---
 kernel/sched/fair.c      |   23 +++++++++++++++++++++--
 kernel/sched/idle_task.c |   16 ++++++++++++++++
 kernel/sched/sched.h     |   12 ++++++++++++
 3 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7a33e59..1de3df0 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1562,6 +1562,27 @@ static inline void dequeue_entity_load_avg(struct cfs_rq *cfs_rq,
 		se->avg.decay_count = atomic64_read(&cfs_rq->decay_counter);
 	} /* migrations, e.g. sleep=0 leave decay_count == 0 */
 }
+
+/*
+ * Update the rq's load with the elapsed running time before entering
+ * idle. if the last scheduled task is not a CFS task, idle_enter will
+ * be the only way to update the runnable statistic.
+ */
+void idle_enter_fair(struct rq *this_rq)
+{
+	update_rq_runnable_avg(this_rq, 1);
+}
+
+/*
+ * Update the rq's load with the elapsed idle time before a task is
+ * scheduled. if the newly scheduled task is not a CFS task, idle_exit will
+ * be the only way to update the runnable statistic.
+ */
+void idle_exit_fair(struct rq *this_rq)
+{
+	update_rq_runnable_avg(this_rq, 0);
+}
+
 #else
 static inline void update_entity_load_avg(struct sched_entity *se,
 					  int update_cfs_rq) {}
@@ -5219,8 +5240,6 @@ void idle_balance(int this_cpu, struct rq *this_rq)
 	if (this_rq->avg_idle < sysctl_sched_migration_cost)
 		return;
 
-	update_rq_runnable_avg(this_rq, 1);
-
 	/*
 	 * Drop the rq->lock, but keep IRQ/preempt disabled.
 	 */
diff --git a/kernel/sched/idle_task.c b/kernel/sched/idle_task.c
index b6baf37..b8ce773 100644
--- a/kernel/sched/idle_task.c
+++ b/kernel/sched/idle_task.c
@@ -13,6 +13,16 @@ select_task_rq_idle(struct task_struct *p, int sd_flag, int flags)
 {
 	return task_cpu(p); /* IDLE tasks as never migrated */
 }
+
+static void pre_schedule_idle(struct rq *rq, struct task_struct *prev)
+{
+	idle_exit_fair(rq);
+}
+
+static void post_schedule_idle(struct rq *rq)
+{
+	idle_enter_fair(rq);
+}
 #endif /* CONFIG_SMP */
 /*
  * Idle tasks are unconditionally rescheduled:
@@ -25,6 +35,10 @@ static void check_preempt_curr_idle(struct rq *rq, struct task_struct *p, int fl
 static struct task_struct *pick_next_task_idle(struct rq *rq)
 {
 	schedstat_inc(rq, sched_goidle);
+#ifdef CONFIG_SMP
+	/* Trigger the post schedule to do an idle_enter for CFS */
+	rq->post_schedule = 1;
+#endif
 	return rq->idle;
 }
 
@@ -86,6 +100,8 @@ const struct sched_class idle_sched_class = {
 
 #ifdef CONFIG_SMP
 	.select_task_rq = select_task_rq_idle,
+	.pre_schedule = pre_schedule_idle,
+	.post_schedule = post_schedule_idle,
 #endif
 
 	.set_curr_task = set_curr_task_idle,
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index cc03cfd..8f1d80e 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -880,6 +880,18 @@ extern const struct sched_class idle_sched_class;
 extern void trigger_load_balance(struct rq *rq, int cpu);
 extern void idle_balance(int this_cpu, struct rq *this_rq);
 
+/*
+ * Only depends on SMP, FAIR_GROUP_SCHED may be removed when runnable_avg
+ * becomes useful in lb
+ */
+#if defined(CONFIG_FAIR_GROUP_SCHED)
+extern void idle_enter_fair(struct rq *this_rq);
+extern void idle_exit_fair(struct rq *this_rq);
+#else
+static inline void idle_enter_fair(struct rq *this_rq) {}
+static inline void idle_exit_fair(struct rq *this_rq) {}
+#endif
+
 #else /* CONFIG_SMP */
 
 static inline void idle_balance(int cpu, struct rq *rq)
-- 
1.7.9.5
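
The accounting problem this patch addresses can be reproduced with a toy model. The sketch below is an assumption-laden simplification, not the kernel's algorithm: it uses floating point and an arbitrary per-tick decay factor (`SKETCH_DECAY`), whereas the kernel uses fixed-point geometric sums, and all names are hypothetical. It simulates a periodic RT task and compares two behaviours: accounting the whole elapsed span as running time when entering idle (no idle-exit update), versus accounting the idle span as idle when leaving idle and only the run span as running when entering idle.

```c
#include <assert.h>

/* Hypothetical simplified tracker; the kernel uses fixed-point sums. */
struct sketch_avg {
	double runnable_sum;	/* decayed time accounted as running */
	double period_sum;	/* decayed total elapsed time */
};

#define SKETCH_DECAY 0.98	/* per-tick decay, chosen for illustration */

/* Account a span of ticks as either running (1) or idle (0). */
static void sketch_account(struct sketch_avg *a, int ticks, int runnable)
{
	for (int i = 0; i < ticks; i++) {
		a->runnable_sum = a->runnable_sum * SKETCH_DECAY +
				  (runnable ? 1.0 : 0.0);
		a->period_sum = a->period_sum * SKETCH_DECAY + 1.0;
	}
}

/*
 * Periodic RT task: idle for idle_ticks, then running for run_ticks.
 * Returns the fraction of time the tracker believes the CPU was running.
 */
static double sketch_simulate(int periods, int run_ticks, int idle_ticks,
			      int with_idle_exit)
{
	struct sketch_avg a = { 0.0, 0.0 };

	for (int p = 0; p < periods; p++) {
		if (with_idle_exit) {
			/* Leaving idle: the idle span is accounted as idle. */
			sketch_account(&a, idle_ticks, 0);
			/* Entering idle: only the run span counts as running. */
			sketch_account(&a, run_ticks, 1);
		} else {
			/* Entering idle: everything since the last update
			 * is assumed to be running time. */
			sketch_account(&a, run_ticks + idle_ticks, 1);
		}
	}
	if (a.period_sum == 0.0)
		return 0.0;
	return a.runnable_sum / a.period_sum;
}
```

For a task that runs 1 tick out of every 10, the variant without the idle-exit update reports a fully loaded CPU, while the variant with it settles near the true duty cycle. That mirrors the commit message's observation that the load of a CPU running only a periodic RT task ends up close to LOAD_AVG_MAX regardless of how long the task actually runs.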

12 years, 1 month


[PATCH v4] sched: fix wrong rq's runnable_avg update with rt tasks

by Vincent Guittot

The current update of the rq's load can be erroneous when RT tasks are involved.

The update of the load of a rq that becomes idle is done only if avg_idle is not less than sysctl_sched_migration_cost. If RT tasks and short idle durations alternate, the runnable_avg will not be updated correctly and the time will be accounted as idle time when a CFS task wakes up.

A new idle_enter function is called when the next task is the idle function, so the elapsed time will be accounted as run time in the load of the rq, whatever the average idle time is. The function update_rq_runnable_avg is removed from idle_balance.

When an RT task is scheduled on an idle CPU, the update of the rq's load is not done when the rq exits the idle state, because CFS's functions are not called. Then idle_balance, which is called just before entering the idle function, updates the rq's load and makes the assumption that the elapsed time since the last update was only running time.

As a consequence, the rq's load of a CPU that only runs a periodic RT task is close to LOAD_AVG_MAX, whatever the running duration of the RT task is.

A new idle_exit function is called when the prev task is the idle function, so the elapsed time will be accounted as idle time in the rq's load.

Changes since V3:
- Remove dependency with CONFIG_FAIR_GROUP_SCHED
- Add a new idle_enter function and create a post_schedule callback for idle class
- Remove the update_runnable_avg from idle_balance

Changes since V2:
- remove useless definition for UP platform
- rebased on top of Steven Rostedt's patches: https://lkml.org/lkml/2013/2/12/558

Changes since V1:
- move code out of schedule function and create a pre_schedule callback for idle class instead.

Signed-off-by: Vincent Guittot <vincent.guittot(a)linaro.org>
---
 kernel/sched/fair.c      |   23 +++++++++++++++++++++--
 kernel/sched/idle_task.c |   10 ++++++++++
 kernel/sched/sched.h     |   12 ++++++++++++
 3 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0fcdbff..1851ca8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1562,6 +1562,27 @@ static inline void dequeue_entity_load_avg(struct cfs_rq *cfs_rq,
 		se->avg.decay_count = atomic64_read(&cfs_rq->decay_counter);
 	} /* migrations, e.g. sleep=0 leave decay_count == 0 */
 }
+
+/*
+ * Update the rq's load with the elapsed running time before entering
+ * idle. if the last scheduled task is not a CFS task, idle_enter will
+ * be the only way to update the runnable statistic.
+ */
+void idle_enter(struct rq *this_rq)
+{
+	update_rq_runnable_avg(this_rq, 1);
+}
+
+/*
+ * Update the rq's load with the elapsed idle time before a task is
+ * scheduled. if the newly scheduled task is not a CFS task, idle_exit will
+ * be the only way to update the runnable statistic.
+ */
+void idle_exit(struct rq *this_rq)
+{
+	update_rq_runnable_avg(this_rq, 0);
+}
+
 #else
 static inline void update_entity_load_avg(struct sched_entity *se,
 					  int update_cfs_rq) {}
@@ -5219,8 +5240,6 @@ void idle_balance(int this_cpu, struct rq *this_rq)
 	if (this_rq->avg_idle < sysctl_sched_migration_cost)
 		return;
 
-	update_rq_runnable_avg(this_rq, 1);
-
 	/*
 	 * Drop the rq->lock, but keep preempt disabled.
 	 */
diff --git a/kernel/sched/idle_task.c b/kernel/sched/idle_task.c
index 66b5220..0775261 100644
--- a/kernel/sched/idle_task.c
+++ b/kernel/sched/idle_task.c
@@ -14,8 +14,17 @@ select_task_rq_idle(struct task_struct *p, int sd_flag, int flags)
 	return task_cpu(p); /* IDLE tasks as never migrated */
 }
 
+static void pre_schedule_idle(struct rq *rq, struct task_struct *prev)
+{
+	/* Update rq's load with elapsed idle time */
+	idle_exit(rq);
+}
+
 static void post_schedule_idle(struct rq *rq)
 {
+	/* Update rq's load with elapsed running time */
+	idle_enter(rq);
+
 	idle_balance(smp_processor_id(), rq);
 }
 #endif /* CONFIG_SMP */
@@ -95,6 +104,7 @@ const struct sched_class idle_sched_class = {
 
 #ifdef CONFIG_SMP
 	.select_task_rq = select_task_rq_idle,
+	.pre_schedule = pre_schedule_idle,
 	.post_schedule = post_schedule_idle,
 #endif
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index fc88644..ff4b029 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -878,6 +878,18 @@ extern const struct sched_class idle_sched_class;
 extern void trigger_load_balance(struct rq *rq, int cpu);
 extern void idle_balance(int this_cpu, struct rq *this_rq);
 
+/*
+ * Only depends on SMP, FAIR_GROUP_SCHED may be removed when runnable_avg
+ * becomes useful in lb
+ */
+#if defined(CONFIG_FAIR_GROUP_SCHED)
+extern void idle_enter(struct rq *this_rq);
+extern void idle_exit(struct rq *this_rq);
+#else
+static inline void idle_enter(struct rq *this_rq) {}
+static inline void idle_exit(struct rq *this_rq) {}
+#endif
+
 #else /* CONFIG_SMP */
 
 static inline void idle_balance(int cpu, struct rq *rq)
-- 
1.7.9.5

12 years, 1 month


[PATCH v5] sched: fix wrong rq's runnable_avg update with rt tasks

by Vincent Guittot

The current update of the rq's load can be erroneous when RT tasks are involved The update of the load of a rq that becomes idle, is done only if the avg_idle is less than sysctl_sched_migration_cost. If RT tasks and short idle duration alternate, the runnable_avg will not be updated correctly and the time will be accounted as idle time when a CFS task wakes up. A new idle_enter function is called when the next task is the idle function so the elapsed time will be accounted as run time in the load of the rq, whatever the average idle time is. The function update_rq_runnable_avg is removed from idle_balance. When a RT task is scheduled on an idle CPU, the update of the rq's load is not done when the rq exit idle state because CFS's functions are not called. Then, the idle_balance, which is called just before entering the idle function, updates the rq's load and makes the assumption that the elapsed time since the last update, was only running time. As a consequence, the rq's load of a CPU that only runs a periodic RT task, is close to LOAD_AVG_MAX whatever the running duration of the RT task is. A new idle_exit function is called when the prev task is the idle function so the elapsed time will be accounted as idle time in the rq's load. Changes since V4: - Rebase on v3.9-rc6 instead of Steven Rostedt's patches - Create the post_schedule_idle function that was previously created by Steven's patches Changes since V3: - Remove dependancy with CONFIG_FAIR_GROUP_SCHED - Add a new idle_enter function and create a post_schedule callback for idle class - Remove the update_runnable_avg from idle_balance Changes since V2: - remove useless definition for UP platform - rebased on top of Steven Rostedt's patches : https://lkml.org/lkml/2013/2/12/558 Changes since V1: - move code out of schedule function and create a pre_schedule callback for idle class instead. 
Signed-off-by: Vincent Guittot <vincent.guittot(a)linaro.org>
---
 kernel/sched/fair.c      | 23 +++++++++++++++++++++--
 kernel/sched/idle_task.c | 16 ++++++++++++++++
 kernel/sched/sched.h     | 12 ++++++++++++
 3 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7a33e59..653edd8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1562,6 +1562,27 @@ static inline void dequeue_entity_load_avg(struct cfs_rq *cfs_rq,
 		se->avg.decay_count = atomic64_read(&cfs_rq->decay_counter);
 	} /* migrations, e.g. sleep=0 leave decay_count == 0 */
 }
+
+/*
+ * Update the rq's load with the elapsed running time before entering
+ * idle. if the last scheduled task is not a CFS task, idle_enter will
+ * be the only way to update the runnable statistic.
+ */
+void idle_enter(struct rq *this_rq)
+{
+	update_rq_runnable_avg(this_rq, 1);
+}
+
+/*
+ * Update the rq's load with the elapsed idle time before a task is
+ * scheduled. if the newly scheduled task is not a CFS task, idle_exit will
+ * be the only way to update the runnable statistic.
+ */
+void idle_exit(struct rq *this_rq)
+{
+	update_rq_runnable_avg(this_rq, 0);
+}
+
 #else
 static inline void update_entity_load_avg(struct sched_entity *se,
 					  int update_cfs_rq) {}
@@ -5219,8 +5240,6 @@ void idle_balance(int this_cpu, struct rq *this_rq)
 	if (this_rq->avg_idle < sysctl_sched_migration_cost)
 		return;
 
-	update_rq_runnable_avg(this_rq, 1);
-
 	/*
 	 * Drop the rq->lock, but keep IRQ/preempt disabled.
 	 */
diff --git a/kernel/sched/idle_task.c b/kernel/sched/idle_task.c
index b6baf37..cef61fa 100644
--- a/kernel/sched/idle_task.c
+++ b/kernel/sched/idle_task.c
@@ -13,6 +13,16 @@ select_task_rq_idle(struct task_struct *p, int sd_flag, int flags)
 {
 	return task_cpu(p); /* IDLE tasks as never migrated */
 }
+
+static void pre_schedule_idle(struct rq *rq, struct task_struct *prev)
+{
+	idle_exit(rq);
+}
+
+static void post_schedule_idle(struct rq *rq)
+{
+	idle_enter(rq);
+}
 #endif /* CONFIG_SMP */
 /*
  * Idle tasks are unconditionally rescheduled:
@@ -25,6 +35,10 @@ static void check_preempt_curr_idle(struct rq *rq, struct task_struct *p, int fl
 static struct task_struct *pick_next_task_idle(struct rq *rq)
 {
 	schedstat_inc(rq, sched_goidle);
+#ifdef CONFIG_SMP
+	/* Trigger the post schedule to do an idle_enter for CFS */
+	rq->post_schedule = 1;
+#endif
 	return rq->idle;
 }
 
@@ -86,6 +100,8 @@ const struct sched_class idle_sched_class = {
 
 #ifdef CONFIG_SMP
 	.select_task_rq = select_task_rq_idle,
+	.pre_schedule = pre_schedule_idle,
+	.post_schedule = post_schedule_idle,
 #endif
 
 	.set_curr_task = set_curr_task_idle,
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index cc03cfd..2b826f2 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -880,6 +880,18 @@ extern const struct sched_class idle_sched_class;
 extern void trigger_load_balance(struct rq *rq, int cpu);
 extern void idle_balance(int this_cpu, struct rq *this_rq);
 
+/*
+ * Only depends on SMP, FAIR_GROUP_SCHED may be removed when runnable_avg
+ * becomes useful in lb
+ */
+#if defined(CONFIG_FAIR_GROUP_SCHED)
+extern void idle_enter(struct rq *this_rq);
+extern void idle_exit(struct rq *this_rq);
+#else
+static inline void idle_enter(struct rq *this_rq) {}
+static inline void idle_exit(struct rq *this_rq) {}
+#endif
+
 #else /* CONFIG_SMP */
 
 static inline void idle_balance(int cpu, struct rq *rq)
-- 
1.7.9.5
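
The accounting problem described in the patch above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct toy_rq`, `toy_update`, `toy_idle_enter`, `toy_idle_exit` and `simulate_periodic_rt` are hypothetical names invented here, and the model uses a plain running/total ratio instead of the kernel's decayed runnable average. It only shows why accounting at idle entry alone makes a CPU that runs a short periodic RT task look fully loaded.

```c
/*
 * Toy model (NOT kernel code): all names are hypothetical and the
 * statistic is a simple ratio, not the kernel's decayed average.
 */
struct toy_rq {
	unsigned long last_update;  /* time of the last accounting point */
	unsigned long runnable_sum; /* time accounted as running */
	unsigned long period_sum;   /* total time accounted */
};

/* Account the time elapsed since the last update as running or idle. */
static void toy_update(struct toy_rq *rq, unsigned long now, int runnable)
{
	unsigned long delta = now - rq->last_update;

	if (runnable)
		rq->runnable_sum += delta;
	rq->period_sum += delta;
	rq->last_update = now;
}

/* Entering idle: the elapsed time is assumed to be running time. */
static void toy_idle_enter(struct toy_rq *rq, unsigned long now)
{
	toy_update(rq, now, 1);
}

/* Leaving idle: the elapsed time is assumed to be idle time. */
static void toy_idle_exit(struct toy_rq *rq, unsigned long now)
{
	toy_update(rq, now, 0);
}

/*
 * Simulate a periodic task that runs `run` ticks out of every `period`
 * ticks on an otherwise idle CPU. Returns the accounted load in percent.
 * With use_idle_exit == 0 only the idle-entry hook fires, which mimics
 * the RT case where no update happens when the CPU leaves idle.
 */
unsigned long simulate_periodic_rt(int cycles, unsigned long run,
				   unsigned long period, int use_idle_exit)
{
	struct toy_rq rq = { 0, 0, 0 };
	unsigned long now = 0;
	int i;

	for (i = 0; i < cycles; i++) {
		now += period - run;            /* CPU sits idle */
		if (use_idle_exit)
			toy_idle_exit(&rq, now); /* close the idle span */
		now += run;                     /* task runs */
		toy_idle_enter(&rq, now);       /* close the running span */
	}
	return rq.runnable_sum * 100 / rq.period_sum;
}
```

In this model, a task running 1 tick out of every 10 is reported as 100% load when only the idle-entry hook fires, and as 10% once the idle-exit hook closes each idle span first.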

12 years, 1 month


bl_image symbol in big.LITTLE switcher code

by Prashant B

Hi, I was going through the b.L switcher code. I found a call to enter_nonsecure_world() with the parameter "bl_image"; obviously it must be the address of the function that initializes the switcher functionality. But I couldn't find any other reference to this symbol in the switcher code. Can somebody please explain this? Thanks. -Prashant

12 years, 2 months


[ACTIVITY] 2013-03-30 - 2013-04-05

by David Long

=== David Long ===

=== Highlights ===
* Responded to QA requests for input on testing requirements for uprobes and kprobes.
* Did some coming up to speed on systemtap.
* Still working on a clean way to disentangle uprobe and kprobe code without unnecessary duplication.

=== Plans ===
* Restructure code
* Start building systemtap

=== Issues ===
* Apparently we have a complaint from a TSC member that Kprobes does not work, yet v3.8 passes the kernel-built-in tests and, when exercised manually, kprobes seem to work. We need more specific information about the problems seen.

=== Travel/Time Off ===

-dl

12 years, 2 months


[PATCH v11 0/3] Add DRM FIMD DT support for Exynos4 DT Machines

by Vikas Sajjan

This patch series adds support for DRM FIMD DT for Exynos4 DT Machines, specifically for Exynos4412 SoC.

changes since v10:
- addressed comments from Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>

changes since v9:
- dropped the patch "ARM: dts: Add lcd pinctrl node entries for EXYNOS4412 SoC" as the gpios in the newly added nodes "lcd_en" and "lcd_sync" in this patch were already PULLed high by existing "lcd_clk" node.
- addressed comments from Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com> and Thomas Abraham <thomas.abraham(a)linaro.org>

changes since v8:
- addressed comments to add missing documentation for clock and clock-names properties as pointed out by Sachin Kamat <sachin.kamat(a)linaro.org>

changes since v7:
- rebased to kgene's "for-next"
- Migrated to Common Clock Framework
- removed the patch "ARM: dts: Add FIMD AUXDATA node entry for exynos4 DT", as migration to Common Clock Framework will NOT need this.
- addressed the comments raised by Sachin Kamat <sachin.kamat(a)linaro.org>

changes since v6:
- addressed comments and added interrupt-names = "fifo", "vsync", "lcd_sys" in exynos4.dtsi and re-ordered the interrupt numbering to match the order in interrupt combiner IP as suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>.

changes since v5:
- renamed the fimd binding documentation file name as "samsung-fimd.txt", since it not only talks about exynos display controller but also about previous samsung display controllers.
- rephrased an ambiguous statement about the interrupt combiner in the fimd binding documentation as pointed out by Sachin Kamat <sachin.kamat(a)linaro.org>

changes since v4:
- moved the fimd binding documentation to Documentation/devicetree/bindings/video/ as suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>
- added more fimd compatibility strings in fimd documentation as discussed at https://patchwork.kernel.org/patch/2144861/ with Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com> and Tomasz Figa <tomasz.figa(a)gmail.com>
- modified compatible string for exynos4 fimd as "exynos4210-fimd" exynos5 fimd as "exynos5250-fimd" to stick to the rule that compatible value should be named after first specific SoC model in which this particular IP version was included as discussed at https://patchwork.kernel.org/patch/2144861/
- documented more about the interrupt combiner and their order as suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>

changes since v3:
- rebased on http://git.kernel.org/?p=linux/kernel/git/kgene/linux-samsung.git;a=shortlo…

changes since v2:
- added alias to 'fimd@11c00000' node (reported by: Rahul Sharma <r.sh.open(a)gmail.com>)
- removed 'lcd0_data' node as there was already a similar node lcd_data24 (reported by: Jingoo Han <jg1.han(a)samsung.com>)
- replaced spaces with tabs in display-timing node

changes since v1:
- added new patch to add FIMD DT binding Documentation
- removed patch enabling SAMSUNG_DEV_BACKLIGHT and SAMSUNG_DEV_PMW for mach-exynos4 DT
- added 'status' property to fimd DT node

Is based on branch kgene's "for-next" https://git.kernel.org/cgit/linux/kernel/git/kgene/linux-samsung.git/log/?h…

Vikas Sajjan (3):
  ARM: dts: Add FIMD node to exynos4
  ARM: dts: Add FIMD node and display timing node to exynos4412-origen.dts
  ARM: dts: Add FIMD DT binding Documentation

 .../devicetree/bindings/video/samsung-fimd.txt | 65 ++++++++++++++++++++
 arch/arm/boot/dts/exynos4.dtsi                 | 12 ++++
 arch/arm/boot/dts/exynos4412-origen.dts        | 21 +++++++
 3 files changed, 98 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/video/samsung-fimd.txt

-- 
1.7.9.5

12 years, 2 months


[PATCH v3 0/2] dma-buf: Add support for debugfs

by Sumit Semwal

The patch series adds a much-missed support for debugfs to dma-buf framework.

Based on the feedback received on v1 of this patch series, support is also added to allow exporters to provide name-strings that will prove useful while debugging.

Some more magic can be added for more advanced debugging, but we'll leave that for the time being.

Best regards,
~Sumit.

---
changes since v2: (based on review comments from Laurent Pinchart)
- reordered functions to avoid forward declaration
- added __exitcall for dma_buf_deinit()

changes since v1:
- added patch to replace dma_buf_export() with dma_buf_export_named(), per suggestion from Daniel Vetter.
- fixes on init and warnings as reported and corrected by Dave Airlie.
- added locking while walking attachment list - reported by Daniel Vetter.

Sumit Semwal (2):
  dma-buf: replace dma_buf_export() with dma_buf_export_named()
  dma-buf: Add debugfs support

 Documentation/dma-buf-sharing.txt |  13 ++-
 drivers/base/dma-buf.c            | 170 ++++++++++++++++++++++++++++++++++++-
 include/linux/dma-buf.h           |  16 +++-
 3 files changed, 190 insertions(+), 9 deletions(-)

-- 
1.7.10.4

12 years, 2 months


[ACTIVITY] (John Stultz) April 1-5

by John Stultz

=== Highlights ===
* Implemented file backed volatile ranges on top of Minchan's anonymous volatile ranges patch, sent out to lkml for comments.
* Had some discussions with Minchan on the issues regarding the various potential interfaces and how they might mix.
* Had some discussions with community folks on how to expose sched_clock timestamps to userland for perf, and how I think it's a bad idea.
* Tglx pulled in the first half of my 3.10 patch queue
* Reviewed Serban's binder patches a few times before he sent them to lkml/arve
* Worked with Viresh to resolve a quirk with merge_config.sh, sent the patch upstream.
* Reviewed blueprints and sent out Android Upstreaming status email.
* Got pinged by some stranger, Zach Pfeffub (or something like that), on tips for upstreaming a hardware driver from AOSP
* Did an interview.
* Got a chromebook & started to play around with it a bit.

=== Plans ===
* Continue focus on volatile range work in prep for lsf-mm
* Still need to work on earlysuspend blog post
* Double around on timekeeping lock hold-time patch queue
* Likely more discussion on perf/sched_clock() interfaces

=== Issues ===
* Oh man. This cold just doesn't go away! This week was a foggy blur (how is it already Friday!?).

12 years, 2 months


[PATCH] sched: core: Fix spelling error inside comment

by Viresh Kumar

sched_domains_numa_distance is written as sched_domains_nume_distance inside one of the comments. Fix it.

Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
 kernel/sched/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 286066e..2e0de12 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6251,7 +6251,7 @@ static void sched_init_numa(void)
 	 * 'level' contains the number of unique distances, excluding the
 	 * identity distance node_distance(i,i).
 	 *
-	 * The sched_domains_nume_distance[] array includes the actual distance
+	 * The sched_domains_numa_distance[] array includes the actual distance
 	 * numbers.
 	 */
 
-- 
1.7.12.rc2.18.g61b472e

12 years, 2 months


[PATCH] cpufreq: Rename index as data in cpufreq_frequency_table

by Viresh Kumar

The "index" field of struct cpufreq_frequency_table was never an index and isn't used at all by the cpufreq core. It is only useful to cpufreq drivers for their own use. Many people nowadays blindly set it in ascending order on the assumption that the core uses it for some work.

This patch renames it to "data", as that is what its purpose is. All users of the field are fixed too.

Cc: Sekhar Nori <nsekhar(a)ti.com>
Cc: Ben Dooks <ben-linux(a)fluff.org>
Cc: Kukjin Kim <kgene.kim(a)samsung.com>
Cc: Simon Horman <horms(a)verge.net.au>
Cc: Magnus Damm <magnus.damm(a)gmail.com>
Cc: Ralf Baechle <ralf(a)linux-mips.org>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Benjamin Herrenschmidt <benh(a)kernel.crashing.org>
Cc: Paul Mackerras <paulus(a)samba.org>
Cc: Olof Johansson <olof(a)lixom.net>
Cc: Stephen Warren <swarren(a)wwwdotorg.org>
Cc: Srinidhi Kasagar <srinidhi.kasagar(a)stericsson.com>
Cc: Linus Walleij <linus.walleij(a)linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
This might not apply cleanly as it depends on a lot of work already being done.
And so is pushed here: https://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/… Documentation/cpu-freq/cpu-drivers.txt | 10 +- arch/arm/mach-davinci/da850.c | 8 +- arch/arm/mach-s3c24xx/pll-s3c2410.c | 54 +++++----- arch/arm/mach-s3c24xx/pll-s3c2440-12000000.c | 54 +++++----- arch/arm/mach-s3c24xx/pll-s3c2440-16934400.c | 110 ++++++++++----------- arch/arm/mach-shmobile/clock-sh7372.c | 6 +- arch/arm/plat-samsung/include/plat/cpu-freq-core.h | 2 +- arch/mips/loongson/lemote-2f/clock.c | 2 +- arch/powerpc/platforms/cell/cbe_cpufreq.c | 4 +- drivers/base/power/opp.c | 4 +- drivers/cpufreq/acpi-cpufreq.c | 6 +- drivers/cpufreq/blackfin-cpufreq.c | 10 +- drivers/cpufreq/e_powersaver.c | 8 +- drivers/cpufreq/freq_table.c | 26 ++--- drivers/cpufreq/ia64-acpi-cpufreq.c | 2 +- drivers/cpufreq/imx-cpufreq.c | 4 +- drivers/cpufreq/kirkwood-cpufreq.c | 2 +- drivers/cpufreq/longhaul.c | 16 +-- drivers/cpufreq/loongson2_cpufreq.c | 2 +- drivers/cpufreq/p4-clockmod.c | 4 +- drivers/cpufreq/pasemi-cpufreq.c | 4 +- drivers/cpufreq/powernow-k6.c | 8 +- drivers/cpufreq/powernow-k7.c | 16 +-- drivers/cpufreq/powernow-k8.c | 18 ++-- drivers/cpufreq/pxa2xx-cpufreq.c | 8 +- drivers/cpufreq/pxa3xx-cpufreq.c | 4 +- drivers/cpufreq/s3c2416-cpufreq.c | 2 +- drivers/cpufreq/s3c64xx-cpufreq.c | 2 +- drivers/cpufreq/sc520_freq.c | 2 +- drivers/cpufreq/sparc-us2e-cpufreq.c | 12 +-- drivers/cpufreq/sparc-us3-cpufreq.c | 8 +- drivers/cpufreq/spear-cpufreq.c | 4 +- drivers/cpufreq/speedstep-centrino.c | 8 +- drivers/cpufreq/tegra-cpufreq.c | 2 +- drivers/mfd/db8500-prcmu.c | 10 +- drivers/sh/clk/core.c | 4 +- include/linux/cpufreq.h | 2 +- 37 files changed, 221 insertions(+), 227 deletions(-) diff --git a/Documentation/cpu-freq/cpu-drivers.txt b/Documentation/cpu-freq/cpu-drivers.txt index a3585ea..ead563a 100644 --- a/Documentation/cpu-freq/cpu-drivers.txt +++ b/Documentation/cpu-freq/cpu-drivers.txt @@ -186,7 +186,7 @@ As most cpufreq processors only allow for being set to a 
few specific frequencies, a "frequency table" with some functions might assist in some work of the processor driver. Such a "frequency table" consists of an array of struct cpufreq_frequency_table entries, with any value in -"index" you want to use, and the corresponding frequency in +"data" you want to use, and the corresponding frequency in "frequency". At the end of the table, you need to add a cpufreq_frequency_table entry with frequency set to CPUFREQ_TABLE_END. And if you want to skip one entry in the table, set the frequency to @@ -214,10 +214,4 @@ int cpufreq_frequency_table_target(struct cpufreq_policy *policy, is the corresponding frequency table helper for the ->target stage. Just pass the values to this function, and the unsigned int index returns the number of the frequency table entry which contains -the frequency the CPU shall be set to. PLEASE NOTE: This is not the -"index" which is in this cpufreq_table_entry.index, but instead -cpufreq_table[index]. So, the new frequency is -cpufreq_table[index].frequency, and the value you stored into the -frequency table "index" field is -cpufreq_table[index].index. - +the frequency the CPU shall be set to. 
diff --git a/arch/arm/mach-davinci/da850.c b/arch/arm/mach-davinci/da850.c index 0c4a26d..5b17f57 100644 --- a/arch/arm/mach-davinci/da850.c +++ b/arch/arm/mach-davinci/da850.c @@ -958,7 +958,7 @@ static const struct da850_opp da850_opp_96 = { #define OPP(freq) \ { \ - .index = (unsigned int) &da850_opp_##freq, \ + .data = (unsigned int) &da850_opp_##freq, \ .frequency = freq * 1000, \ } @@ -970,7 +970,7 @@ static struct cpufreq_frequency_table da850_freq_table[] = { OPP(200), OPP(96), { - .index = 0, + .data = 0, .frequency = CPUFREQ_TABLE_END, }, }; @@ -998,7 +998,7 @@ static int da850_set_voltage(unsigned int index) if (!cvdd) return -ENODEV; - opp = (struct da850_opp *) cpufreq_info.freq_table[index].index; + opp = (struct da850_opp *) cpufreq_info.freq_table[index].data; return regulator_set_voltage(cvdd, opp->cvdd_min, opp->cvdd_max); } @@ -1079,7 +1079,7 @@ static int da850_set_pll0rate(struct clk *clk, unsigned long index) struct pll_data *pll = clk->pll_data; int ret; - opp = (struct da850_opp *) cpufreq_info.freq_table[index].index; + opp = (struct da850_opp *) cpufreq_info.freq_table[index].data; prediv = opp->prediv; mult = opp->mult; postdiv = opp->postdiv; diff --git a/arch/arm/mach-s3c24xx/pll-s3c2410.c b/arch/arm/mach-s3c24xx/pll-s3c2410.c index dcf3420..5051f74 100644 --- a/arch/arm/mach-s3c24xx/pll-s3c2410.c +++ b/arch/arm/mach-s3c24xx/pll-s3c2410.c @@ -33,36 +33,36 @@ #include <plat/cpu-freq-core.h> static struct cpufreq_frequency_table pll_vals_12MHz[] = { - { .frequency = 34000000, .index = PLLVAL(82, 2, 3), }, - { .frequency = 45000000, .index = PLLVAL(82, 1, 3), }, - { .frequency = 51000000, .index = PLLVAL(161, 3, 3), }, - { .frequency = 48000000, .index = PLLVAL(120, 2, 3), }, - { .frequency = 56000000, .index = PLLVAL(142, 2, 3), }, - { .frequency = 68000000, .index = PLLVAL(82, 2, 2), }, - { .frequency = 79000000, .index = PLLVAL(71, 1, 2), }, - { .frequency = 85000000, .index = PLLVAL(105, 2, 2), }, - { .frequency = 90000000, .index = 
PLLVAL(112, 2, 2), }, - { .frequency = 101000000, .index = PLLVAL(127, 2, 2), }, - { .frequency = 113000000, .index = PLLVAL(105, 1, 2), }, - { .frequency = 118000000, .index = PLLVAL(150, 2, 2), }, - { .frequency = 124000000, .index = PLLVAL(116, 1, 2), }, - { .frequency = 135000000, .index = PLLVAL(82, 2, 1), }, - { .frequency = 147000000, .index = PLLVAL(90, 2, 1), }, - { .frequency = 152000000, .index = PLLVAL(68, 1, 1), }, - { .frequency = 158000000, .index = PLLVAL(71, 1, 1), }, - { .frequency = 170000000, .index = PLLVAL(77, 1, 1), }, - { .frequency = 180000000, .index = PLLVAL(82, 1, 1), }, - { .frequency = 186000000, .index = PLLVAL(85, 1, 1), }, - { .frequency = 192000000, .index = PLLVAL(88, 1, 1), }, - { .frequency = 203000000, .index = PLLVAL(161, 3, 1), }, + { .frequency = 34000000, .data = PLLVAL(82, 2, 3), }, + { .frequency = 45000000, .data = PLLVAL(82, 1, 3), }, + { .frequency = 51000000, .data = PLLVAL(161, 3, 3), }, + { .frequency = 48000000, .data = PLLVAL(120, 2, 3), }, + { .frequency = 56000000, .data = PLLVAL(142, 2, 3), }, + { .frequency = 68000000, .data = PLLVAL(82, 2, 2), }, + { .frequency = 79000000, .data = PLLVAL(71, 1, 2), }, + { .frequency = 85000000, .data = PLLVAL(105, 2, 2), }, + { .frequency = 90000000, .data = PLLVAL(112, 2, 2), }, + { .frequency = 101000000, .data = PLLVAL(127, 2, 2), }, + { .frequency = 113000000, .data = PLLVAL(105, 1, 2), }, + { .frequency = 118000000, .data = PLLVAL(150, 2, 2), }, + { .frequency = 124000000, .data = PLLVAL(116, 1, 2), }, + { .frequency = 135000000, .data = PLLVAL(82, 2, 1), }, + { .frequency = 147000000, .data = PLLVAL(90, 2, 1), }, + { .frequency = 152000000, .data = PLLVAL(68, 1, 1), }, + { .frequency = 158000000, .data = PLLVAL(71, 1, 1), }, + { .frequency = 170000000, .data = PLLVAL(77, 1, 1), }, + { .frequency = 180000000, .data = PLLVAL(82, 1, 1), }, + { .frequency = 186000000, .data = PLLVAL(85, 1, 1), }, + { .frequency = 192000000, .data = PLLVAL(88, 1, 1), }, + { .frequency = 
203000000, .data = PLLVAL(161, 3, 1), }, /* 2410A extras */ - { .frequency = 210000000, .index = PLLVAL(132, 2, 1), }, - { .frequency = 226000000, .index = PLLVAL(105, 1, 1), }, - { .frequency = 266000000, .index = PLLVAL(125, 1, 1), }, - { .frequency = 268000000, .index = PLLVAL(126, 1, 1), }, - { .frequency = 270000000, .index = PLLVAL(127, 1, 1), }, + { .frequency = 210000000, .data = PLLVAL(132, 2, 1), }, + { .frequency = 226000000, .data = PLLVAL(105, 1, 1), }, + { .frequency = 266000000, .data = PLLVAL(125, 1, 1), }, + { .frequency = 268000000, .data = PLLVAL(126, 1, 1), }, + { .frequency = 270000000, .data = PLLVAL(127, 1, 1), }, }; static int s3c2410_plls_add(struct device *dev, struct subsys_interface *sif) diff --git a/arch/arm/mach-s3c24xx/pll-s3c2440-12000000.c b/arch/arm/mach-s3c24xx/pll-s3c2440-12000000.c index 6737817..63f86e7 100644 --- a/arch/arm/mach-s3c24xx/pll-s3c2440-12000000.c +++ b/arch/arm/mach-s3c24xx/pll-s3c2440-12000000.c @@ -21,33 +21,33 @@ #include <plat/cpu-freq-core.h> static struct cpufreq_frequency_table s3c2440_plls_12[] __initdata = { - { .frequency = 75000000, .index = PLLVAL(0x75, 3, 3), }, /* FVco 600.000000 */ - { .frequency = 80000000, .index = PLLVAL(0x98, 4, 3), }, /* FVco 640.000000 */ - { .frequency = 90000000, .index = PLLVAL(0x70, 2, 3), }, /* FVco 720.000000 */ - { .frequency = 100000000, .index = PLLVAL(0x5c, 1, 3), }, /* FVco 800.000000 */ - { .frequency = 110000000, .index = PLLVAL(0x66, 1, 3), }, /* FVco 880.000000 */ - { .frequency = 120000000, .index = PLLVAL(0x70, 1, 3), }, /* FVco 960.000000 */ - { .frequency = 150000000, .index = PLLVAL(0x75, 3, 2), }, /* FVco 600.000000 */ - { .frequency = 160000000, .index = PLLVAL(0x98, 4, 2), }, /* FVco 640.000000 */ - { .frequency = 170000000, .index = PLLVAL(0x4d, 1, 2), }, /* FVco 680.000000 */ - { .frequency = 180000000, .index = PLLVAL(0x70, 2, 2), }, /* FVco 720.000000 */ - { .frequency = 190000000, .index = PLLVAL(0x57, 1, 2), }, /* FVco 760.000000 */ - { .frequency 
= 200000000, .index = PLLVAL(0x5c, 1, 2), }, /* FVco 800.000000 */ - { .frequency = 210000000, .index = PLLVAL(0x84, 2, 2), }, /* FVco 840.000000 */ - { .frequency = 220000000, .index = PLLVAL(0x66, 1, 2), }, /* FVco 880.000000 */ - { .frequency = 230000000, .index = PLLVAL(0x6b, 1, 2), }, /* FVco 920.000000 */ - { .frequency = 240000000, .index = PLLVAL(0x70, 1, 2), }, /* FVco 960.000000 */ - { .frequency = 300000000, .index = PLLVAL(0x75, 3, 1), }, /* FVco 600.000000 */ - { .frequency = 310000000, .index = PLLVAL(0x93, 4, 1), }, /* FVco 620.000000 */ - { .frequency = 320000000, .index = PLLVAL(0x98, 4, 1), }, /* FVco 640.000000 */ - { .frequency = 330000000, .index = PLLVAL(0x66, 2, 1), }, /* FVco 660.000000 */ - { .frequency = 340000000, .index = PLLVAL(0x4d, 1, 1), }, /* FVco 680.000000 */ - { .frequency = 350000000, .index = PLLVAL(0xa7, 4, 1), }, /* FVco 700.000000 */ - { .frequency = 360000000, .index = PLLVAL(0x70, 2, 1), }, /* FVco 720.000000 */ - { .frequency = 370000000, .index = PLLVAL(0xb1, 4, 1), }, /* FVco 740.000000 */ - { .frequency = 380000000, .index = PLLVAL(0x57, 1, 1), }, /* FVco 760.000000 */ - { .frequency = 390000000, .index = PLLVAL(0x7a, 2, 1), }, /* FVco 780.000000 */ - { .frequency = 400000000, .index = PLLVAL(0x5c, 1, 1), }, /* FVco 800.000000 */ + { .frequency = 75000000, .data = PLLVAL(0x75, 3, 3), }, /* FVco 600.000000 */ + { .frequency = 80000000, .data = PLLVAL(0x98, 4, 3), }, /* FVco 640.000000 */ + { .frequency = 90000000, .data = PLLVAL(0x70, 2, 3), }, /* FVco 720.000000 */ + { .frequency = 100000000, .data = PLLVAL(0x5c, 1, 3), }, /* FVco 800.000000 */ + { .frequency = 110000000, .data = PLLVAL(0x66, 1, 3), }, /* FVco 880.000000 */ + { .frequency = 120000000, .data = PLLVAL(0x70, 1, 3), }, /* FVco 960.000000 */ + { .frequency = 150000000, .data = PLLVAL(0x75, 3, 2), }, /* FVco 600.000000 */ + { .frequency = 160000000, .data = PLLVAL(0x98, 4, 2), }, /* FVco 640.000000 */ + { .frequency = 170000000, .data = PLLVAL(0x4d, 1, 2), 
}, /* FVco 680.000000 */ + { .frequency = 180000000, .data = PLLVAL(0x70, 2, 2), }, /* FVco 720.000000 */ + { .frequency = 190000000, .data = PLLVAL(0x57, 1, 2), }, /* FVco 760.000000 */ + { .frequency = 200000000, .data = PLLVAL(0x5c, 1, 2), }, /* FVco 800.000000 */ + { .frequency = 210000000, .data = PLLVAL(0x84, 2, 2), }, /* FVco 840.000000 */ + { .frequency = 220000000, .data = PLLVAL(0x66, 1, 2), }, /* FVco 880.000000 */ + { .frequency = 230000000, .data = PLLVAL(0x6b, 1, 2), }, /* FVco 920.000000 */ + { .frequency = 240000000, .data = PLLVAL(0x70, 1, 2), }, /* FVco 960.000000 */ + { .frequency = 300000000, .data = PLLVAL(0x75, 3, 1), }, /* FVco 600.000000 */ + { .frequency = 310000000, .data = PLLVAL(0x93, 4, 1), }, /* FVco 620.000000 */ + { .frequency = 320000000, .data = PLLVAL(0x98, 4, 1), }, /* FVco 640.000000 */ + { .frequency = 330000000, .data = PLLVAL(0x66, 2, 1), }, /* FVco 660.000000 */ + { .frequency = 340000000, .data = PLLVAL(0x4d, 1, 1), }, /* FVco 680.000000 */ + { .frequency = 350000000, .data = PLLVAL(0xa7, 4, 1), }, /* FVco 700.000000 */ + { .frequency = 360000000, .data = PLLVAL(0x70, 2, 1), }, /* FVco 720.000000 */ + { .frequency = 370000000, .data = PLLVAL(0xb1, 4, 1), }, /* FVco 740.000000 */ + { .frequency = 380000000, .data = PLLVAL(0x57, 1, 1), }, /* FVco 760.000000 */ + { .frequency = 390000000, .data = PLLVAL(0x7a, 2, 1), }, /* FVco 780.000000 */ + { .frequency = 400000000, .data = PLLVAL(0x5c, 1, 1), }, /* FVco 800.000000 */ }; static int s3c2440_plls12_add(struct device *dev, struct subsys_interface *sif) diff --git a/arch/arm/mach-s3c24xx/pll-s3c2440-16934400.c b/arch/arm/mach-s3c24xx/pll-s3c2440-16934400.c index debfa10..7786b2e 100644 --- a/arch/arm/mach-s3c24xx/pll-s3c2440-16934400.c +++ b/arch/arm/mach-s3c24xx/pll-s3c2440-16934400.c @@ -21,61 +21,61 @@ #include <plat/cpu-freq-core.h> static struct cpufreq_frequency_table s3c2440_plls_169344[] __initdata = { - { .frequency = 78019200, .index = PLLVAL(121, 5, 3), }, /* FVco 
624.153600 */ - { .frequency = 84067200, .index = PLLVAL(131, 5, 3), }, /* FVco 672.537600 */ - { .frequency = 90115200, .index = PLLVAL(141, 5, 3), }, /* FVco 720.921600 */ - { .frequency = 96163200, .index = PLLVAL(151, 5, 3), }, /* FVco 769.305600 */ - { .frequency = 102135600, .index = PLLVAL(185, 6, 3), }, /* FVco 817.084800 */ - { .frequency = 108259200, .index = PLLVAL(171, 5, 3), }, /* FVco 866.073600 */ - { .frequency = 114307200, .index = PLLVAL(127, 3, 3), }, /* FVco 914.457600 */ - { .frequency = 120234240, .index = PLLVAL(134, 3, 3), }, /* FVco 961.873920 */ - { .frequency = 126161280, .index = PLLVAL(141, 3, 3), }, /* FVco 1009.290240 */ - { .frequency = 132088320, .index = PLLVAL(148, 3, 3), }, /* FVco 1056.706560 */ - { .frequency = 138015360, .index = PLLVAL(155, 3, 3), }, /* FVco 1104.122880 */ - { .frequency = 144789120, .index = PLLVAL(163, 3, 3), }, /* FVco 1158.312960 */ - { .frequency = 150100363, .index = PLLVAL(187, 9, 2), }, /* FVco 600.401454 */ - { .frequency = 156038400, .index = PLLVAL(121, 5, 2), }, /* FVco 624.153600 */ - { .frequency = 162086400, .index = PLLVAL(126, 5, 2), }, /* FVco 648.345600 */ - { .frequency = 168134400, .index = PLLVAL(131, 5, 2), }, /* FVco 672.537600 */ - { .frequency = 174048000, .index = PLLVAL(177, 7, 2), }, /* FVco 696.192000 */ - { .frequency = 180230400, .index = PLLVAL(141, 5, 2), }, /* FVco 720.921600 */ - { .frequency = 186278400, .index = PLLVAL(124, 4, 2), }, /* FVco 745.113600 */ - { .frequency = 192326400, .index = PLLVAL(151, 5, 2), }, /* FVco 769.305600 */ - { .frequency = 198132480, .index = PLLVAL(109, 3, 2), }, /* FVco 792.529920 */ - { .frequency = 204271200, .index = PLLVAL(185, 6, 2), }, /* FVco 817.084800 */ - { .frequency = 210268800, .index = PLLVAL(141, 4, 2), }, /* FVco 841.075200 */ - { .frequency = 216518400, .index = PLLVAL(171, 5, 2), }, /* FVco 866.073600 */ - { .frequency = 222264000, .index = PLLVAL(97, 2, 2), }, /* FVco 889.056000 */ - { .frequency = 228614400, .index = 
PLLVAL(127, 3, 2), }, /* FVco 914.457600 */ - { .frequency = 234259200, .index = PLLVAL(158, 4, 2), }, /* FVco 937.036800 */ - { .frequency = 240468480, .index = PLLVAL(134, 3, 2), }, /* FVco 961.873920 */ - { .frequency = 246960000, .index = PLLVAL(167, 4, 2), }, /* FVco 987.840000 */ - { .frequency = 252322560, .index = PLLVAL(141, 3, 2), }, /* FVco 1009.290240 */ - { .frequency = 258249600, .index = PLLVAL(114, 2, 2), }, /* FVco 1032.998400 */ - { .frequency = 264176640, .index = PLLVAL(148, 3, 2), }, /* FVco 1056.706560 */ - { .frequency = 270950400, .index = PLLVAL(120, 2, 2), }, /* FVco 1083.801600 */ - { .frequency = 276030720, .index = PLLVAL(155, 3, 2), }, /* FVco 1104.122880 */ - { .frequency = 282240000, .index = PLLVAL(92, 1, 2), }, /* FVco 1128.960000 */ - { .frequency = 289578240, .index = PLLVAL(163, 3, 2), }, /* FVco 1158.312960 */ - { .frequency = 294235200, .index = PLLVAL(131, 2, 2), }, /* FVco 1176.940800 */ - { .frequency = 300200727, .index = PLLVAL(187, 9, 1), }, /* FVco 600.401454 */ - { .frequency = 306358690, .index = PLLVAL(191, 9, 1), }, /* FVco 612.717380 */ - { .frequency = 312076800, .index = PLLVAL(121, 5, 1), }, /* FVco 624.153600 */ - { .frequency = 318366720, .index = PLLVAL(86, 3, 1), }, /* FVco 636.733440 */ - { .frequency = 324172800, .index = PLLVAL(126, 5, 1), }, /* FVco 648.345600 */ - { .frequency = 330220800, .index = PLLVAL(109, 4, 1), }, /* FVco 660.441600 */ - { .frequency = 336268800, .index = PLLVAL(131, 5, 1), }, /* FVco 672.537600 */ - { .frequency = 342074880, .index = PLLVAL(93, 3, 1), }, /* FVco 684.149760 */ - { .frequency = 348096000, .index = PLLVAL(177, 7, 1), }, /* FVco 696.192000 */ - { .frequency = 355622400, .index = PLLVAL(118, 4, 1), }, /* FVco 711.244800 */ - { .frequency = 360460800, .index = PLLVAL(141, 5, 1), }, /* FVco 720.921600 */ - { .frequency = 366206400, .index = PLLVAL(165, 6, 1), }, /* FVco 732.412800 */ - { .frequency = 372556800, .index = PLLVAL(124, 4, 1), }, /* FVco 745.113600 */ - { 
.frequency = 378201600, .index = PLLVAL(126, 4, 1), }, /* FVco 756.403200 */ - { .frequency = 384652800, .index = PLLVAL(151, 5, 1), }, /* FVco 769.305600 */ - { .frequency = 391608000, .index = PLLVAL(177, 6, 1), }, /* FVco 783.216000 */ - { .frequency = 396264960, .index = PLLVAL(109, 3, 1), }, /* FVco 792.529920 */ - { .frequency = 402192000, .index = PLLVAL(87, 2, 1), }, /* FVco 804.384000 */ + { .frequency = 78019200, .data = PLLVAL(121, 5, 3), }, /* FVco 624.153600 */ + { .frequency = 84067200, .data = PLLVAL(131, 5, 3), }, /* FVco 672.537600 */ + { .frequency = 90115200, .data = PLLVAL(141, 5, 3), }, /* FVco 720.921600 */ + { .frequency = 96163200, .data = PLLVAL(151, 5, 3), }, /* FVco 769.305600 */ + { .frequency = 102135600, .data = PLLVAL(185, 6, 3), }, /* FVco 817.084800 */ + { .frequency = 108259200, .data = PLLVAL(171, 5, 3), }, /* FVco 866.073600 */ + { .frequency = 114307200, .data = PLLVAL(127, 3, 3), }, /* FVco 914.457600 */ + { .frequency = 120234240, .data = PLLVAL(134, 3, 3), }, /* FVco 961.873920 */ + { .frequency = 126161280, .data = PLLVAL(141, 3, 3), }, /* FVco 1009.290240 */ + { .frequency = 132088320, .data = PLLVAL(148, 3, 3), }, /* FVco 1056.706560 */ + { .frequency = 138015360, .data = PLLVAL(155, 3, 3), }, /* FVco 1104.122880 */ + { .frequency = 144789120, .data = PLLVAL(163, 3, 3), }, /* FVco 1158.312960 */ + { .frequency = 150100363, .data = PLLVAL(187, 9, 2), }, /* FVco 600.401454 */ + { .frequency = 156038400, .data = PLLVAL(121, 5, 2), }, /* FVco 624.153600 */ + { .frequency = 162086400, .data = PLLVAL(126, 5, 2), }, /* FVco 648.345600 */ + { .frequency = 168134400, .data = PLLVAL(131, 5, 2), }, /* FVco 672.537600 */ + { .frequency = 174048000, .data = PLLVAL(177, 7, 2), }, /* FVco 696.192000 */ + { .frequency = 180230400, .data = PLLVAL(141, 5, 2), }, /* FVco 720.921600 */ + { .frequency = 186278400, .data = PLLVAL(124, 4, 2), }, /* FVco 745.113600 */ + { .frequency = 192326400, .data = PLLVAL(151, 5, 2), }, /* FVco 769.305600 */ 
+ { .frequency = 198132480, .data = PLLVAL(109, 3, 2), }, /* FVco 792.529920 */ + { .frequency = 204271200, .data = PLLVAL(185, 6, 2), }, /* FVco 817.084800 */ + { .frequency = 210268800, .data = PLLVAL(141, 4, 2), }, /* FVco 841.075200 */ + { .frequency = 216518400, .data = PLLVAL(171, 5, 2), }, /* FVco 866.073600 */ + { .frequency = 222264000, .data = PLLVAL(97, 2, 2), }, /* FVco 889.056000 */ + { .frequency = 228614400, .data = PLLVAL(127, 3, 2), }, /* FVco 914.457600 */ + { .frequency = 234259200, .data = PLLVAL(158, 4, 2), }, /* FVco 937.036800 */ + { .frequency = 240468480, .data = PLLVAL(134, 3, 2), }, /* FVco 961.873920 */ + { .frequency = 246960000, .data = PLLVAL(167, 4, 2), }, /* FVco 987.840000 */ + { .frequency = 252322560, .data = PLLVAL(141, 3, 2), }, /* FVco 1009.290240 */ + { .frequency = 258249600, .data = PLLVAL(114, 2, 2), }, /* FVco 1032.998400 */ + { .frequency = 264176640, .data = PLLVAL(148, 3, 2), }, /* FVco 1056.706560 */ + { .frequency = 270950400, .data = PLLVAL(120, 2, 2), }, /* FVco 1083.801600 */ + { .frequency = 276030720, .data = PLLVAL(155, 3, 2), }, /* FVco 1104.122880 */ + { .frequency = 282240000, .data = PLLVAL(92, 1, 2), }, /* FVco 1128.960000 */ + { .frequency = 289578240, .data = PLLVAL(163, 3, 2), }, /* FVco 1158.312960 */ + { .frequency = 294235200, .data = PLLVAL(131, 2, 2), }, /* FVco 1176.940800 */ + { .frequency = 300200727, .data = PLLVAL(187, 9, 1), }, /* FVco 600.401454 */ + { .frequency = 306358690, .data = PLLVAL(191, 9, 1), }, /* FVco 612.717380 */ + { .frequency = 312076800, .data = PLLVAL(121, 5, 1), }, /* FVco 624.153600 */ + { .frequency = 318366720, .data = PLLVAL(86, 3, 1), }, /* FVco 636.733440 */ + { .frequency = 324172800, .data = PLLVAL(126, 5, 1), }, /* FVco 648.345600 */ + { .frequency = 330220800, .data = PLLVAL(109, 4, 1), }, /* FVco 660.441600 */ + { .frequency = 336268800, .data = PLLVAL(131, 5, 1), }, /* FVco 672.537600 */ + { .frequency = 342074880, .data = PLLVAL(93, 3, 1), }, /* FVco 
684.149760 */ + { .frequency = 348096000, .data = PLLVAL(177, 7, 1), }, /* FVco 696.192000 */ + { .frequency = 355622400, .data = PLLVAL(118, 4, 1), }, /* FVco 711.244800 */ + { .frequency = 360460800, .data = PLLVAL(141, 5, 1), }, /* FVco 720.921600 */ + { .frequency = 366206400, .data = PLLVAL(165, 6, 1), }, /* FVco 732.412800 */ + { .frequency = 372556800, .data = PLLVAL(124, 4, 1), }, /* FVco 745.113600 */ + { .frequency = 378201600, .data = PLLVAL(126, 4, 1), }, /* FVco 756.403200 */ + { .frequency = 384652800, .data = PLLVAL(151, 5, 1), }, /* FVco 769.305600 */ + { .frequency = 391608000, .data = PLLVAL(177, 6, 1), }, /* FVco 783.216000 */ + { .frequency = 396264960, .data = PLLVAL(109, 3, 1), }, /* FVco 792.529920 */ + { .frequency = 402192000, .data = PLLVAL(87, 2, 1), }, /* FVco 804.384000 */ }; static int s3c2440_plls169344_add(struct device *dev, diff --git a/arch/arm/mach-shmobile/clock-sh7372.c b/arch/arm/mach-shmobile/clock-sh7372.c index 45d21fe..1981c6d 100644 --- a/arch/arm/mach-shmobile/clock-sh7372.c +++ b/arch/arm/mach-shmobile/clock-sh7372.c @@ -171,15 +171,15 @@ static void pllc2_table_rebuild(struct clk *clk) /* Initialise PLLC2 frequency table */ for (i = 0; i < ARRAY_SIZE(pllc2_freq_table) - 2; i++) { pllc2_freq_table[i].frequency = clk->parent->rate * (i + 20) * 2; - pllc2_freq_table[i].index = i; + pllc2_freq_table[i].data = i; } /* This is a special entry - switching PLL off makes it a repeater */ pllc2_freq_table[i].frequency = clk->parent->rate; - pllc2_freq_table[i].index = i; + pllc2_freq_table[i].data = i; pllc2_freq_table[++i].frequency = CPUFREQ_TABLE_END; - pllc2_freq_table[i].index = i; + pllc2_freq_table[i].data = i; } static unsigned long pllc2_recalc(struct clk *clk) diff --git a/arch/arm/plat-samsung/include/plat/cpu-freq-core.h b/arch/arm/plat-samsung/include/plat/cpu-freq-core.h index d7e1715..126fce4 100644 --- a/arch/arm/plat-samsung/include/plat/cpu-freq-core.h +++ b/arch/arm/plat-samsung/include/plat/cpu-freq-core.h @@ 
-285,7 +285,7 @@ static inline int s3c_cpufreq_addfreq(struct cpufreq_frequency_table *table, s3c_freq_dbg("%s: { %d = %u kHz }\n", __func__, index, freq); - table[index].index = index; + table[index].data = index; table[index].frequency = freq; } diff --git a/arch/mips/loongson/lemote-2f/clock.c b/arch/mips/loongson/lemote-2f/clock.c index bc739d4..06edc17 100644 --- a/arch/mips/loongson/lemote-2f/clock.c +++ b/arch/mips/loongson/lemote-2f/clock.c @@ -121,7 +121,7 @@ int clk_set_rate(struct clk *clk, unsigned long rate) clk->rate = rate; regval = LOONGSON_CHIPCFG0; - regval = (regval & ~0x7) | (loongson2_clockmod_table[i].index - 1); + regval = (regval & ~0x7) | (loongson2_clockmod_table[i].data - 1); LOONGSON_CHIPCFG0 = regval; return ret; diff --git a/arch/powerpc/platforms/cell/cbe_cpufreq.c b/arch/powerpc/platforms/cell/cbe_cpufreq.c index 718c6a3..1d693c0 100644 --- a/arch/powerpc/platforms/cell/cbe_cpufreq.c +++ b/arch/powerpc/platforms/cell/cbe_cpufreq.c @@ -105,7 +105,7 @@ static int cbe_cpufreq_cpu_init(struct cpufreq_policy *policy) /* initialize frequency table */ for (i=0; cbe_freqs[i].frequency!=CPUFREQ_TABLE_END; i++) { - cbe_freqs[i].frequency = max_freq / cbe_freqs[i].index; + cbe_freqs[i].frequency = max_freq / cbe_freqs[i].data; pr_debug("%d: %d\n", i, cbe_freqs[i].frequency); } @@ -164,7 +164,7 @@ static int cbe_cpufreq_target(struct cpufreq_policy *policy, "1/%d of max frequency\n", policy->cpu, cbe_freqs[cbe_pmode_new].frequency, - cbe_freqs[cbe_pmode_new].index); + cbe_freqs[cbe_pmode_new].data); rc = set_pmode(policy->cpu, cbe_pmode_new); diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c index 32ee0fc..08f9fb1 100644 --- a/drivers/base/power/opp.c +++ b/drivers/base/power/opp.c @@ -647,14 +647,14 @@ int opp_init_cpufreq_table(struct device *dev, list_for_each_entry(opp, &dev_opp->opp_list, node) { if (opp->available) { - freq_table[i].index = i; + freq_table[i].data = i; freq_table[i].frequency = opp->rate / 1000; i++; } } 
mutex_unlock(&dev_opp_list_lock); - freq_table[i].index = i; + freq_table[i].data = i; freq_table[i].frequency = CPUFREQ_TABLE_END; *table = &freq_table[0]; diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c index 11b8b4b..ce28c34 100644 --- a/drivers/cpufreq/acpi-cpufreq.c +++ b/drivers/cpufreq/acpi-cpufreq.c @@ -232,7 +232,7 @@ static unsigned extract_msr(u32 msr, struct acpi_cpufreq_data *data) perf = data->acpi_data; for (i = 0; data->freq_table[i].frequency != CPUFREQ_TABLE_END; i++) { - if (msr == perf->states[data->freq_table[i].index].status) + if (msr == perf->states[data->freq_table[i].data].status) return data->freq_table[i].frequency; } return data->freq_table[0].frequency; @@ -442,7 +442,7 @@ static int acpi_cpufreq_target(struct cpufreq_policy *policy, goto out; } - next_perf_state = data->freq_table[next_state].index; + next_perf_state = data->freq_table[next_state].data; if (perf->state == next_perf_state) { if (unlikely(data->resume)) { pr_debug("Called after resume, resetting to P%d\n", @@ -811,7 +811,7 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy) data->freq_table[valid_states-1].frequency / 1000) continue; - data->freq_table[valid_states].index = i; + data->freq_table[valid_states].data = i; data->freq_table[valid_states].frequency = perf->states[i].core_frequency * 1000; valid_states++; diff --git a/drivers/cpufreq/blackfin-cpufreq.c b/drivers/cpufreq/blackfin-cpufreq.c index 995511e80..03857f1 100644 --- a/drivers/cpufreq/blackfin-cpufreq.c +++ b/drivers/cpufreq/blackfin-cpufreq.c @@ -20,23 +20,23 @@ /* this is the table of CCLK frequencies, in Hz */ -/* .index is the entry in the auxiliary dpm_state_table[] */ +/* .data is the entry in the auxiliary dpm_state_table[] */ static struct cpufreq_frequency_table bfin_freq_table[] = { { .frequency = CPUFREQ_TABLE_END, - .index = 0, + .data = 0, }, { .frequency = CPUFREQ_TABLE_END, - .index = 1, + .data = 1, }, { .frequency = CPUFREQ_TABLE_END, - .index 
= 2, + .data = 2, }, { .frequency = CPUFREQ_TABLE_END, - .index = 0, + .data = 0, }, }; diff --git a/drivers/cpufreq/e_powersaver.c b/drivers/cpufreq/e_powersaver.c index 37380fb..7cc03b7 100644 --- a/drivers/cpufreq/e_powersaver.c +++ b/drivers/cpufreq/e_powersaver.c @@ -188,7 +188,7 @@ static int eps_target(struct cpufreq_policy *policy, } /* Make frequency transition */ - dest_state = centaur->freq_table[newstate].index & 0xffff; + dest_state = centaur->freq_table[newstate].data & 0xffff; ret = eps_set_state(centaur, policy, dest_state); if (ret) printk(KERN_ERR "eps: Timeout!\n"); @@ -380,9 +380,9 @@ static int eps_cpu_init(struct cpufreq_policy *policy) f_table = &centaur->freq_table[0]; if (brand != EPS_BRAND_C7M) { f_table[0].frequency = fsb * min_multiplier; - f_table[0].index = (min_multiplier << 8) | min_voltage; + f_table[0].data = (min_multiplier << 8) | min_voltage; f_table[1].frequency = fsb * max_multiplier; - f_table[1].index = (max_multiplier << 8) | max_voltage; + f_table[1].data = (max_multiplier << 8) | max_voltage; f_table[2].frequency = CPUFREQ_TABLE_END; } else { k = 0; @@ -391,7 +391,7 @@ static int eps_cpu_init(struct cpufreq_policy *policy) for (i = min_multiplier; i <= max_multiplier; i++) { voltage = (k * step) / 256 + min_voltage; f_table[k].frequency = fsb * i; - f_table[k].index = (i << 8) | voltage; + f_table[k].data = (i << 8) | voltage; k++; } f_table[k].frequency = CPUFREQ_TABLE_END; diff --git a/drivers/cpufreq/freq_table.c b/drivers/cpufreq/freq_table.c index d7a7966..c17a605 100644 --- a/drivers/cpufreq/freq_table.c +++ b/drivers/cpufreq/freq_table.c @@ -34,8 +34,8 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy, continue; } - pr_debug("table entry %u: %u kHz, %u index\n", - i, freq, table[i].index); + pr_debug("table entry %u: %u kHz, %u data\n", + i, freq, table[i].data); if (freq < min_freq) min_freq = freq; if (freq > max_freq) @@ -97,11 +97,11 @@ int cpufreq_frequency_table_target(struct cpufreq_policy 
*policy, unsigned int *index) { struct cpufreq_frequency_table optimal = { - .index = ~0, + .data = ~0, .frequency = 0, }; struct cpufreq_frequency_table suboptimal = { - .index = ~0, + .data = ~0, .frequency = 0, }; unsigned int i; @@ -129,12 +129,12 @@ int cpufreq_frequency_table_target(struct cpufreq_policy *policy, if (freq <= target_freq) { if (freq >= optimal.frequency) { optimal.frequency = freq; - optimal.index = i; + optimal.data = i; } } else { if (freq <= suboptimal.frequency) { suboptimal.frequency = freq; - suboptimal.index = i; + suboptimal.data = i; } } break; @@ -142,26 +142,26 @@ int cpufreq_frequency_table_target(struct cpufreq_policy *policy, if (freq >= target_freq) { if (freq <= optimal.frequency) { optimal.frequency = freq; - optimal.index = i; + optimal.data = i; } } else { if (freq >= suboptimal.frequency) { suboptimal.frequency = freq; - suboptimal.index = i; + suboptimal.data = i; } } break; } } - if (optimal.index > i) { - if (suboptimal.index > i) + if (optimal.data > i) { + if (suboptimal.data > i) return -EINVAL; - *index = suboptimal.index; + *index = suboptimal.data; } else - *index = optimal.index; + *index = optimal.data; pr_debug("target is %u (%u kHz, %u)\n", *index, table[*index].frequency, - table[*index].index); + table[*index].data); return 0; } diff --git a/drivers/cpufreq/ia64-acpi-cpufreq.c b/drivers/cpufreq/ia64-acpi-cpufreq.c index c0075db..b8e8536 100644 --- a/drivers/cpufreq/ia64-acpi-cpufreq.c +++ b/drivers/cpufreq/ia64-acpi-cpufreq.c @@ -326,7 +326,7 @@ acpi_cpufreq_cpu_init ( /* table init */ for (i = 0; i <= data->acpi_data.state_count; i++) { - data->freq_table[i].index = i; + data->freq_table[i].data = i; if (i < data->acpi_data.state_count) { data->freq_table[i].frequency = data->acpi_data.states[i].core_frequency * 1000; diff --git a/drivers/cpufreq/imx-cpufreq.c b/drivers/cpufreq/imx-cpufreq.c index 20250b4..3017118 100644 --- a/drivers/cpufreq/imx-cpufreq.c +++ b/drivers/cpufreq/imx-cpufreq.c @@ -129,7 +129,7 
@@ static int mxc_cpufreq_init(struct cpufreq_policy *policy) } for (i = 0; i < cpu_op_nr; i++) { - imx_freq_table[i].index = i; + imx_freq_table[i].data = i; imx_freq_table[i].frequency = cpu_op_tbl[i].cpu_rate / 1000; if ((cpu_op_tbl[i].cpu_rate / 1000) < cpu_freq_khz_min) @@ -139,7 +139,7 @@ static int mxc_cpufreq_init(struct cpufreq_policy *policy) cpu_freq_khz_max = cpu_op_tbl[i].cpu_rate / 1000; } - imx_freq_table[i].index = i; + imx_freq_table[i].data = i; imx_freq_table[i].frequency = CPUFREQ_TABLE_END; policy->cur = clk_get_rate(cpu_clk) / 1000; diff --git a/drivers/cpufreq/kirkwood-cpufreq.c b/drivers/cpufreq/kirkwood-cpufreq.c index d36ea8d..7fdc677 100644 --- a/drivers/cpufreq/kirkwood-cpufreq.c +++ b/drivers/cpufreq/kirkwood-cpufreq.c @@ -59,7 +59,7 @@ static void kirkwood_cpufreq_set_cpu_state(struct cpufreq_policy *policy, unsigned int index) { struct cpufreq_freqs freqs; - unsigned int state = kirkwood_freq_table[index].index; + unsigned int state = kirkwood_freq_table[index].data; unsigned long reg; freqs.old = kirkwood_cpufreq_get_cpu_frequency(0); diff --git a/drivers/cpufreq/longhaul.c b/drivers/cpufreq/longhaul.c index b448638..95e56bd 100644 --- a/drivers/cpufreq/longhaul.c +++ b/drivers/cpufreq/longhaul.c @@ -254,7 +254,7 @@ static void longhaul_setstate(struct cpufreq_policy *policy, u32 bm_timeout = 1000; unsigned int dir = 0; - mults_index = longhaul_table[table_index].index; + mults_index = longhaul_table[table_index].data; /* Safety precautions */ mult = mults[mults_index & 0x1f]; if (mult == -1) @@ -487,7 +487,7 @@ static int __cpuinit longhaul_get_ranges(void) if (ratio > maxmult || ratio < minmult) continue; longhaul_table[k].frequency = calc_speed(ratio); - longhaul_table[k].index = j; + longhaul_table[k].data = j; k++; } if (k <= 1) { @@ -508,8 +508,8 @@ static int __cpuinit longhaul_get_ranges(void) if (min_i != j) { swap(longhaul_table[j].frequency, longhaul_table[min_i].frequency); - swap(longhaul_table[j].index, - 
longhaul_table[min_i].index); + swap(longhaul_table[j].data, + longhaul_table[min_i].data); } } @@ -517,7 +517,7 @@ static int __cpuinit longhaul_get_ranges(void) /* Find index we are running on */ for (j = 0; j < k; j++) { - if (mults[longhaul_table[j].index & 0x1f] == mult) { + if (mults[longhaul_table[j].data & 0x1f] == mult) { longhaul_index = j; break; } @@ -613,7 +613,7 @@ static void __cpuinit longhaul_setup_voltagescaling(void) pos = (speed - min_vid_speed) / kHz_step + minvid.pos; else pos = minvid.pos; - longhaul_table[j].index |= mV_vrm_table[pos] << 8; + longhaul_table[j].data |= mV_vrm_table[pos] << 8; vid = vrm_mV_table[mV_vrm_table[pos]]; printk(KERN_INFO PFX "f: %d kHz, index: %d, vid: %d mV\n", speed, j, vid.mV); @@ -656,12 +656,12 @@ static int longhaul_target(struct cpufreq_policy *policy, * this in hardware, C3 is old and we need to do this * in software. */ i = longhaul_index; - current_vid = (longhaul_table[longhaul_index].index >> 8); + current_vid = (longhaul_table[longhaul_index].data >> 8); current_vid &= 0x1f; if (table_index > longhaul_index) dir = 1; while (i != table_index) { - vid = (longhaul_table[i].index >> 8) & 0x1f; + vid = (longhaul_table[i].data >> 8) & 0x1f; if (vid != current_vid) { longhaul_setstate(policy, i); current_vid = vid; diff --git a/drivers/cpufreq/loongson2_cpufreq.c b/drivers/cpufreq/loongson2_cpufreq.c index 8488957..d9ae287 100644 --- a/drivers/cpufreq/loongson2_cpufreq.c +++ b/drivers/cpufreq/loongson2_cpufreq.c @@ -71,7 +71,7 @@ static int loongson2_cpufreq_target(struct cpufreq_policy *policy, freq = ((cpu_clock_freq / 1000) * - loongson2_clockmod_table[newstate].index) / 8; + loongson2_clockmod_table[newstate].data) / 8; if (freq < policy->min || freq > policy->max) return -EINVAL; diff --git a/drivers/cpufreq/p4-clockmod.c b/drivers/cpufreq/p4-clockmod.c index 421ef37..6d69901 100644 --- a/drivers/cpufreq/p4-clockmod.c +++ b/drivers/cpufreq/p4-clockmod.c @@ -118,7 +118,7 @@ static int 
cpufreq_p4_target(struct cpufreq_policy *policy, return -EINVAL; freqs.old = cpufreq_p4_get(policy->cpu); - freqs.new = stock_freq * p4clockmod_table[newstate].index / 8; + freqs.new = stock_freq * p4clockmod_table[newstate].data / 8; if (freqs.new == freqs.old) return 0; @@ -131,7 +131,7 @@ static int cpufreq_p4_target(struct cpufreq_policy *policy, * Developer's Manual, Volume 3 */ for_each_cpu(i, policy->cpus) - cpufreq_p4_setdc(i, p4clockmod_table[newstate].index); + cpufreq_p4_setdc(i, p4clockmod_table[newstate].data); /* notifiers */ cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE); diff --git a/drivers/cpufreq/pasemi-cpufreq.c b/drivers/cpufreq/pasemi-cpufreq.c index be1e795..3786e2b 100644 --- a/drivers/cpufreq/pasemi-cpufreq.c +++ b/drivers/cpufreq/pasemi-cpufreq.c @@ -204,7 +204,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy *policy) /* initialize frequency table */ for (i=0; pas_freqs[i].frequency!=CPUFREQ_TABLE_END; i++) { - pas_freqs[i].frequency = get_astate_freq(pas_freqs[i].index) * 100000; + pas_freqs[i].frequency = get_astate_freq(pas_freqs[i].data) * 100000; pr_debug("%d: %d\n", i, pas_freqs[i].frequency); } @@ -280,7 +280,7 @@ static int pas_cpufreq_target(struct cpufreq_policy *policy, pr_debug("setting frequency for cpu %d to %d kHz, 1/%d of max frequency\n", policy->cpu, pas_freqs[pas_astate_new].frequency, - pas_freqs[pas_astate_new].index); + pas_freqs[pas_astate_new].data); current_astate = pas_astate_new; diff --git a/drivers/cpufreq/powernow-k6.c b/drivers/cpufreq/powernow-k6.c index ea0222a..3b9f74d 100644 --- a/drivers/cpufreq/powernow-k6.c +++ b/drivers/cpufreq/powernow-k6.c @@ -58,7 +58,7 @@ static int powernow_k6_get_cpu_multiplier(void) msrval = POWERNOW_IOPORT + 0x0; wrmsr(MSR_K6_EPMR, msrval, 0); /* disable it again */ - return clock_ratio[(invalue >> 5)&7].index; + return clock_ratio[(invalue >> 5)&7].data; } @@ -75,13 +75,13 @@ static void powernow_k6_set_state(struct cpufreq_policy *policy, unsigned 
long msrval; struct cpufreq_freqs freqs; - if (clock_ratio[best_i].index > max_multiplier) { + if (clock_ratio[best_i].data > max_multiplier) { printk(KERN_ERR PFX "invalid target frequency\n"); return; } freqs.old = busfreq * powernow_k6_get_cpu_multiplier(); - freqs.new = busfreq * clock_ratio[best_i].index; + freqs.new = busfreq * clock_ratio[best_i].data; cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE); @@ -156,7 +156,7 @@ static int powernow_k6_cpu_init(struct cpufreq_policy *policy) /* table init */ for (i = 0; (clock_ratio[i].frequency != CPUFREQ_TABLE_END); i++) { - f = clock_ratio[i].index; + f = clock_ratio[i].data; if (f > max_multiplier) clock_ratio[i].frequency = CPUFREQ_ENTRY_INVALID; else diff --git a/drivers/cpufreq/powernow-k7.c b/drivers/cpufreq/powernow-k7.c index 53888da..cffe828 100644 --- a/drivers/cpufreq/powernow-k7.c +++ b/drivers/cpufreq/powernow-k7.c @@ -186,7 +186,7 @@ static int get_ranges(unsigned char *pst) fid = *pst++; powernow_table[j].frequency = (fsb * fid_codes[fid]) / 10; - powernow_table[j].index = fid; /* lower 8 bits */ + powernow_table[j].data = fid; /* lower 8 bits */ speed = powernow_table[j].frequency; @@ -203,7 +203,7 @@ static int get_ranges(unsigned char *pst) maximum_speed = speed; vid = *pst++; - powernow_table[j].index |= (vid << 8); /* upper 8 bits */ + powernow_table[j].data |= (vid << 8); /* upper 8 bits */ pr_debug(" FID: 0x%x (%d.%dx [%dMHz]) " "VID: 0x%x (%d.%03dV)\n", fid, fid_codes[fid] / 10, @@ -212,7 +212,7 @@ static int get_ranges(unsigned char *pst) mobile_vid_table[vid]%1000); } powernow_table[number_scales].frequency = CPUFREQ_TABLE_END; - powernow_table[number_scales].index = 0; + powernow_table[number_scales].data = 0; return 0; } @@ -260,8 +260,8 @@ static void change_speed(struct cpufreq_policy *policy, unsigned int index) * vid are the upper 8 bits. 
*/ - fid = powernow_table[index].index & 0xFF; - vid = (powernow_table[index].index & 0xFF00) >> 8; + fid = powernow_table[index].data & 0xFF; + vid = (powernow_table[index].data & 0xFF00) >> 8; rdmsrl(MSR_K7_FID_VID_STATUS, fidvidstatus.val); cfid = fidvidstatus.bits.CFID; @@ -373,8 +373,8 @@ static int powernow_acpi_init(void) fid = pc.bits.fid; powernow_table[i].frequency = fsb * fid_codes[fid] / 10; - powernow_table[i].index = fid; /* lower 8 bits */ - powernow_table[i].index |= (vid << 8); /* upper 8 bits */ + powernow_table[i].data = fid; /* lower 8 bits */ + powernow_table[i].data |= (vid << 8); /* upper 8 bits */ speed = powernow_table[i].frequency; speed_mhz = speed / 1000; @@ -417,7 +417,7 @@ static int powernow_acpi_init(void) } powernow_table[i].frequency = CPUFREQ_TABLE_END; - powernow_table[i].index = 0; + powernow_table[i].data = 0; /* notify BIOS that we exist */ acpi_processor_notify_smm(THIS_MODULE); diff --git a/drivers/cpufreq/powernow-k8.c b/drivers/cpufreq/powernow-k8.c index b828efe..6865266 100644 --- a/drivers/cpufreq/powernow-k8.c +++ b/drivers/cpufreq/powernow-k8.c @@ -584,9 +584,9 @@ static void print_basics(struct powernow_k8_data *data) CPUFREQ_ENTRY_INVALID) { printk(KERN_INFO PFX "fid 0x%x (%d MHz), vid 0x%x\n", - data->powernow_table[j].index & 0xff, + data->powernow_table[j].data & 0xff, data->powernow_table[j].frequency/1000, - data->powernow_table[j].index >> 8); + data->powernow_table[j].data >> 8); } } if (data->batps) @@ -632,13 +632,13 @@ static int fill_powernow_table(struct powernow_k8_data *data, for (j = 0; j < data->numps; j++) { int freq; - powernow_table[j].index = pst[j].fid; /* lower 8 bits */ - powernow_table[j].index |= (pst[j].vid << 8); /* upper 8 bits */ + powernow_table[j].data = pst[j].fid; /* lower 8 bits */ + powernow_table[j].data |= (pst[j].vid << 8); /* upper 8 bits */ freq = find_khz_freq_from_fid(pst[j].fid); powernow_table[j].frequency = freq; } powernow_table[data->numps].frequency = 
CPUFREQ_TABLE_END; - powernow_table[data->numps].index = 0; + powernow_table[data->numps].data = 0; if (query_current_values_with_pending_wait(data)) { kfree(powernow_table); @@ -810,7 +810,7 @@ static int powernow_k8_cpu_init_acpi(struct powernow_k8_data *data) powernow_table[data->acpi_data.state_count].frequency = CPUFREQ_TABLE_END; - powernow_table[data->acpi_data.state_count].index = 0; + powernow_table[data->acpi_data.state_count].data = 0; data->powernow_table = powernow_table; if (cpumask_first(cpu_core_mask(data->cpu)) == data->cpu) @@ -865,7 +865,7 @@ static int fill_powernow_table_fidvid(struct powernow_k8_data *data, pr_debug(" %d : fid 0x%x, vid 0x%x\n", i, fid, vid); index = fid | (vid<<8); - powernow_table[i].index = index; + powernow_table[i].data = index; freq = find_khz_freq_from_fid(fid); powernow_table[i].frequency = freq; @@ -941,8 +941,8 @@ static int transition_frequency_fidvid(struct powernow_k8_data *data, * the cpufreq frequency table in find_psb_table, vid * are the upper 8 bits. 
*/ - fid = data->powernow_table[index].index & 0xFF; - vid = (data->powernow_table[index].index & 0xFF00) >> 8; + fid = data->powernow_table[index].data & 0xFF; + vid = (data->powernow_table[index].data & 0xFF00) >> 8; pr_debug("table matched fid 0x%x, giving vid 0x%x\n", fid, vid); diff --git a/drivers/cpufreq/pxa2xx-cpufreq.c b/drivers/cpufreq/pxa2xx-cpufreq.c index fe4c55b..d2469e7 100644 --- a/drivers/cpufreq/pxa2xx-cpufreq.c +++ b/drivers/cpufreq/pxa2xx-cpufreq.c @@ -419,7 +419,7 @@ static int pxa_cpufreq_init(struct cpufreq_policy *policy) /* Generate pxa25x the run cpufreq_frequency_table struct */ for (i = 0; i < NUM_PXA25x_RUN_FREQS; i++) { pxa255_run_freq_table[i].frequency = pxa255_run_freqs[i].khz; - pxa255_run_freq_table[i].index = i; + pxa255_run_freq_table[i].data = i; } pxa255_run_freq_table[i].frequency = CPUFREQ_TABLE_END; @@ -427,7 +427,7 @@ static int pxa_cpufreq_init(struct cpufreq_policy *policy) for (i = 0; i < NUM_PXA25x_TURBO_FREQS; i++) { pxa255_turbo_freq_table[i].frequency = pxa255_turbo_freqs[i].khz; - pxa255_turbo_freq_table[i].index = i; + pxa255_turbo_freq_table[i].data = i; } pxa255_turbo_freq_table[i].frequency = CPUFREQ_TABLE_END; @@ -439,9 +439,9 @@ static int pxa_cpufreq_init(struct cpufreq_policy *policy) if (freq > pxa27x_maxfreq) break; pxa27x_freq_table[i].frequency = freq; - pxa27x_freq_table[i].index = i; + pxa27x_freq_table[i].data = i; } - pxa27x_freq_table[i].index = i; + pxa27x_freq_table[i].data = i; pxa27x_freq_table[i].frequency = CPUFREQ_TABLE_END; /* diff --git a/drivers/cpufreq/pxa3xx-cpufreq.c b/drivers/cpufreq/pxa3xx-cpufreq.c index 15d60f8..adaa78b 100644 --- a/drivers/cpufreq/pxa3xx-cpufreq.c +++ b/drivers/cpufreq/pxa3xx-cpufreq.c @@ -98,10 +98,10 @@ static int setup_freqs_table(struct cpufreq_policy *policy, return -ENOMEM; for (i = 0; i < num; i++) { - table[i].index = i; + table[i].data = i; table[i].frequency = freqs[i].cpufreq_mhz * 1000; } - table[num].index = i; + table[num].data = i; 
table[num].frequency = CPUFREQ_TABLE_END; pxa3xx_freqs = freqs; diff --git a/drivers/cpufreq/s3c2416-cpufreq.c b/drivers/cpufreq/s3c2416-cpufreq.c index 4f1881e..0c17fe7 100644 --- a/drivers/cpufreq/s3c2416-cpufreq.c +++ b/drivers/cpufreq/s3c2416-cpufreq.c @@ -244,7 +244,7 @@ static int s3c2416_cpufreq_set_target(struct cpufreq_policy *policy, if (ret != 0) goto out; - idx = s3c_freq->freq_table[i].index; + idx = s3c_freq->freq_table[i].data; if (idx == SOURCE_HCLK) to_dvs = 1; diff --git a/drivers/cpufreq/s3c64xx-cpufreq.c b/drivers/cpufreq/s3c64xx-cpufreq.c index 27cacb5..240d5c8 100644 --- a/drivers/cpufreq/s3c64xx-cpufreq.c +++ b/drivers/cpufreq/s3c64xx-cpufreq.c @@ -87,7 +87,7 @@ static int s3c64xx_cpufreq_set_target(struct cpufreq_policy *policy, freqs.old = clk_get_rate(armclk) / 1000; freqs.new = s3c64xx_freq_table[i].frequency; freqs.flags = 0; - dvfs = &s3c64xx_dvfs_table[s3c64xx_freq_table[i].index]; + dvfs = &s3c64xx_dvfs_table[s3c64xx_freq_table[i].data]; if (freqs.old == freqs.new) return 0; diff --git a/drivers/cpufreq/sc520_freq.c b/drivers/cpufreq/sc520_freq.c index f740b13..edf7b2d 100644 --- a/drivers/cpufreq/sc520_freq.c +++ b/drivers/cpufreq/sc520_freq.c @@ -71,7 +71,7 @@ static void sc520_freq_set_cpu_state(struct cpufreq_policy *policy, local_irq_disable(); clockspeed_reg = *cpuctl & ~0x03; - *cpuctl = clockspeed_reg | sc520_freq_table[state].index; + *cpuctl = clockspeed_reg | sc520_freq_table[state].data; local_irq_enable(); diff --git a/drivers/cpufreq/sparc-us2e-cpufreq.c b/drivers/cpufreq/sparc-us2e-cpufreq.c index 306ae46..216e166 100644 --- a/drivers/cpufreq/sparc-us2e-cpufreq.c +++ b/drivers/cpufreq/sparc-us2e-cpufreq.c @@ -308,17 +308,17 @@ static int __init us2e_freq_cpu_init(struct cpufreq_policy *policy) struct cpufreq_frequency_table *table = &us2e_freq_table[cpu].table[0]; - table[0].index = 0; + table[0].data = 0; table[0].frequency = clock_tick / 1; - table[1].index = 1; + table[1].data = 1; table[1].frequency = clock_tick / 
2; - table[2].index = 2; + table[2].data = 2; table[2].frequency = clock_tick / 4; - table[2].index = 3; + table[2].data = 3; table[2].frequency = clock_tick / 6; - table[2].index = 4; + table[2].data = 4; table[2].frequency = clock_tick / 8; - table[2].index = 5; + table[2].data = 5; table[3].frequency = CPUFREQ_TABLE_END; policy->cpuinfo.transition_latency = 0; diff --git a/drivers/cpufreq/sparc-us3-cpufreq.c b/drivers/cpufreq/sparc-us3-cpufreq.c index c71ee14..9889b8e 100644 --- a/drivers/cpufreq/sparc-us3-cpufreq.c +++ b/drivers/cpufreq/sparc-us3-cpufreq.c @@ -169,13 +169,13 @@ static int __init us3_freq_cpu_init(struct cpufreq_policy *policy) struct cpufreq_frequency_table *table = &us3_freq_table[cpu].table[0]; - table[0].index = 0; + table[0].data = 0; table[0].frequency = clock_tick / 1; - table[1].index = 1; + table[1].data = 1; table[1].frequency = clock_tick / 2; - table[2].index = 2; + table[2].data = 2; table[2].frequency = clock_tick / 32; - table[3].index = 0; + table[3].data = 0; table[3].frequency = CPUFREQ_TABLE_END; policy->cpuinfo.transition_latency = 0; diff --git a/drivers/cpufreq/spear-cpufreq.c b/drivers/cpufreq/spear-cpufreq.c index 156829f..ec448bfc 100644 --- a/drivers/cpufreq/spear-cpufreq.c +++ b/drivers/cpufreq/spear-cpufreq.c @@ -250,11 +250,11 @@ static int spear_cpufreq_driver_init(void) } for (i = 0; i < cnt; i++) { - freq_tbl[i].index = i; + freq_tbl[i].data = i; freq_tbl[i].frequency = be32_to_cpup(val++); } - freq_tbl[i].index = i; + freq_tbl[i].data = i; freq_tbl[i].frequency = CPUFREQ_TABLE_END; spear_cpufreq.freq_tbl = freq_tbl; diff --git a/drivers/cpufreq/speedstep-centrino.c b/drivers/cpufreq/speedstep-centrino.c index 618e6f4..fcfa2f8 100644 --- a/drivers/cpufreq/speedstep-centrino.c +++ b/drivers/cpufreq/speedstep-centrino.c @@ -79,11 +79,11 @@ static struct cpufreq_driver centrino_driver; /* Computes the correct form for IA32_PERF_CTL MSR for a particular frequency/voltage operating point; frequency in MHz, volts in mV. 
- This is stored as "index" in the structure. */ + This is stored as "data" in the structure. */ #define OP(mhz, mv) \ { \ .frequency = (mhz) * 1000, \ - .index = (((mhz)/100) << 8) | ((mv - 700) / 16) \ + .data = (((mhz)/100) << 8) | ((mv - 700) / 16) \ } /* @@ -307,7 +307,7 @@ static unsigned extract_clock(unsigned msr, unsigned int cpu, int failsafe) per_cpu(centrino_model, cpu)->op_points[i].frequency != CPUFREQ_TABLE_END; i++) { - if (msr == per_cpu(centrino_model, cpu)->op_points[i].index) + if (msr == per_cpu(centrino_model, cpu)->op_points[i].data) return per_cpu(centrino_model, cpu)-> op_points[i].frequency; } @@ -501,7 +501,7 @@ static int centrino_target (struct cpufreq_policy *policy, break; } - msr = per_cpu(centrino_model, cpu)->op_points[newstate].index; + msr = per_cpu(centrino_model, cpu)->op_points[newstate].data; if (first_cpu) { rdmsr_on_cpu(good_cpu, MSR_IA32_PERF_CTL, &oldmsr, &h); diff --git a/drivers/cpufreq/tegra-cpufreq.c b/drivers/cpufreq/tegra-cpufreq.c index c74c0e1..fca6184 100644 --- a/drivers/cpufreq/tegra-cpufreq.c +++ b/drivers/cpufreq/tegra-cpufreq.c @@ -28,7 +28,7 @@ #include <linux/io.h> #include <linux/suspend.h> -/* Frequency table index must be sequential starting at 0 */ +/* Frequency table data must be sequential starting at 0 */ static struct cpufreq_frequency_table freq_table[] = { { 0, 216000 }, { 1, 312000 }, diff --git a/drivers/mfd/db8500-prcmu.c b/drivers/mfd/db8500-prcmu.c index 21f261b..e210ff1 100644 --- a/drivers/mfd/db8500-prcmu.c +++ b/drivers/mfd/db8500-prcmu.c @@ -1810,9 +1810,9 @@ static long round_clock_rate(u8 clock, unsigned long rate) /* CPU FREQ table, may be changed due to if MAX_OPP is supported. 
 */
 static struct cpufreq_frequency_table db8500_cpufreq_table[] = {
-	{ .frequency = 200000, .index = ARM_EXTCLK,},
-	{ .frequency = 400000, .index = ARM_50_OPP,},
-	{ .frequency = 800000, .index = ARM_100_OPP,},
+	{ .frequency = 200000, .data = ARM_EXTCLK,},
+	{ .frequency = 400000, .data = ARM_50_OPP,},
+	{ .frequency = 800000, .data = ARM_100_OPP,},
 	{ .frequency = CPUFREQ_TABLE_END,}, /* To be used for MAX_OPP. */
 	{ .frequency = CPUFREQ_TABLE_END,},
 };
@@ -1987,7 +1987,7 @@ static int set_armss_rate(unsigned long rate)
 		return -EINVAL;
 
 	/* Set the new arm opp. */
-	return db8500_prcmu_set_arm_opp(db8500_cpufreq_table[i].index);
+	return db8500_prcmu_set_arm_opp(db8500_cpufreq_table[i].data);
 }
 
 static int set_plldsi_rate(unsigned long rate)
@@ -3137,7 +3137,7 @@ static void db8500_prcmu_update_cpufreq(void)
 {
 	if (prcmu_has_arm_maxopp()) {
 		db8500_cpufreq_table[3].frequency = 1000000;
-		db8500_cpufreq_table[3].index = ARM_MAX_OPP;
+		db8500_cpufreq_table[3].data = ARM_MAX_OPP;
 	}
 }
 
diff --git a/drivers/sh/clk/core.c b/drivers/sh/clk/core.c
index 7715de2..bc48116 100644
--- a/drivers/sh/clk/core.c
+++ b/drivers/sh/clk/core.c
@@ -63,12 +63,12 @@ void clk_rate_table_build(struct clk *clk,
 		else
 			freq = clk->parent->rate * mult / div;
 
-		freq_table[i].index = i;
+		freq_table[i].data = i;
 		freq_table[i].frequency = freq;
 	}
 
 	/* Termination entry */
-	freq_table[i].index = i;
+	freq_table[i].data = i;
 	freq_table[i].frequency = CPUFREQ_TABLE_END;
 }
 
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 037d36a..aa0c2a3 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -404,7 +404,7 @@ extern struct cpufreq_governor cpufreq_gov_conservative;
 #define CPUFREQ_TABLE_END ~1
 
 struct cpufreq_frequency_table {
-	unsigned int index; /* any */
+	unsigned int data; /* any value, not used by core */
 	unsigned int frequency; /* kHz - doesn't need to be in ascending
 				 * order */
 };
--
1.7.12.rc2.18.g61b472e
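
For readers skimming the patch, the effect of the rename can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: the struct layout and the two sentinel values mirror include/linux/cpufreq.h as shown in the patch, while example_find_at_or_below() and example_table are hypothetical names made up for this illustration. The lookup reads only .frequency; .data is an opaque driver value (here packed as multiplier << 8 | voltage code, in the style of the e_powersaver.c hunk) that the driver decodes itself.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace copies of the sentinels from include/linux/cpufreq.h. */
#define CPUFREQ_ENTRY_INVALID (~0u)
#define CPUFREQ_TABLE_END     (~1u)

struct cpufreq_frequency_table {
	unsigned int data;      /* any value, not used by the lookup below */
	unsigned int frequency; /* kHz, need not be in ascending order */
};

/*
 * Hypothetical helper (not a kernel API): return the position of the
 * highest valid frequency that is <= target_khz, or -1 if none fits.
 * Note that it never looks at .data.
 */
static int example_find_at_or_below(const struct cpufreq_frequency_table *table,
				    unsigned int target_khz)
{
	int best = -1;
	size_t i;

	assert(table != NULL);
	for (i = 0; table[i].frequency != CPUFREQ_TABLE_END; i++) {
		unsigned int freq = table[i].frequency;

		if (freq == CPUFREQ_ENTRY_INVALID || freq > target_khz)
			continue;
		if (best < 0 || freq > table[best].frequency)
			best = (int)i;
	}
	return best;
}

/* Illustrative table: .data packs (multiplier << 8) | voltage code. */
static const struct cpufreq_frequency_table example_table[] = {
	{ .data = (4u << 8) | 0x10, .frequency = 400000 },
	{ .data = (8u << 8) | 0x18, .frequency = 800000 },
	{ .data = (6u << 8) | 0x14, .frequency = 600000 },
	{ .data = 0,                .frequency = CPUFREQ_TABLE_END },
};
```

A driver would take the returned position, read table[pos].data, and decode it according to its own convention; nothing in the lookup depends on .data being a sequential index, which is presumably why the field name was considered misleading.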


[PATCH 0/9] ARM: CPUFreq: Move drivers arch/arm/ -> drivers/cpufreq

by Viresh Kumar

Hi,

This patchset moves all CPUFreq drivers for ARM platforms from arch/arm to the drivers/cpufreq directory.

This series depends on (and is rebased on) the following patches:

http://www.spinics.net/lists/arm-kernel/msg232540.html
https://lkml.org/lkml/2013/3/24/151

I want this series to go through Rafael's tree because of these dependencies, so I would like Acks from the respective maintainers.

The patches are applied here for people who want to test them:

http://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/h…

Viresh Kumar (9):
  cpufreq: ARM: Arrange drivers in alphabetical order
  cpufreq: tegra: Move driver to drivers/cpufreq
  cpufreq: davinci: move cpufreq driver to drivers/cpufreq
  cpufreq: imx: move cpufreq driver to drivers/cpufreq
  cpufreq: integrator: move cpufreq driver to drivers/cpufreq
  cpufreq: pxa3xx: move cpufreq driver to drivers/cpufreq
  cpufreq: pxa2xx: move cpufreq driver to drivers/cpufreq
  cpufreq: s3c24xx: move cpufreq driver to drivers/cpufreq
  cpufreq: sa11x0: move cpufreq driver to drivers/cpufreq

 arch/arm/Kconfig                                   |  79 ---------
 arch/arm/mach-davinci/Makefile                     |   1 -
 arch/arm/mach-imx/Makefile                         |   1 -
 arch/arm/mach-imx/mxc.h                            |   6 +-
 arch/arm/mach-integrator/Makefile                  |   1 -
 arch/arm/mach-pxa/Makefile                         |   6 -
 arch/arm/mach-pxa/include/mach/generic.h           |   1 +
 arch/arm/mach-s3c24xx/Kconfig                      |  66 +++----
 arch/arm/mach-s3c24xx/Makefile                     |   6 -
 arch/arm/mach-s3c24xx/{ => include/mach}/s3c2412.h |   0
 arch/arm/mach-s3c24xx/iotiming-s3c2412.c           |   2 +-
 arch/arm/mach-sa1100/Kconfig                       |  26 +--
 arch/arm/mach-sa1100/Makefile                      |   3 -
 arch/arm/mach-sa1100/include/mach/generic.h        |   1 +
 arch/arm/mach-tegra/Makefile                       |   1 -
 arch/arm/plat-samsung/include/plat/cpu-freq-core.h |  10 +-
 arch/arm/plat-samsung/include/plat/cpu-freq.h      |   6 +-
 drivers/cpufreq/Kconfig.arm                        | 191 +++++++++++++++------
 drivers/cpufreq/Makefile                           |  24 ++-
 .../cpufreq.c => drivers/cpufreq/davinci-cpufreq.c |   2 -
 .../cpufreq.c => drivers/cpufreq/imx-cpufreq.c     |   7 +-
 .../cpu.c => drivers/cpufreq/integrator-cpufreq.c  |   2 -
 .../cpufreq/pxa2xx-cpufreq.c                       |   2 -
 .../cpufreq/pxa3xx-cpufreq.c                       |   5 +-
 .../cpufreq/s3c2410-cpufreq.c                      |   0
 .../cpufreq/s3c2412-cpufreq.c                      |   3 +-
 .../cpufreq/s3c2440-cpufreq.c                      |   0
 .../cpufreq/s3c24xx-cpufreq-debugfs.c              |   0
 .../cpufreq.c => drivers/cpufreq/s3c24xx-cpufreq.c |   0
 .../cpufreq/sa1100-cpufreq.c                       |   3 +-
 .../cpufreq/sa1110-cpufreq.c                       |   3 +-
 .../cpu-tegra.c => drivers/cpufreq/tegra-cpufreq.c |   2 -
 include/linux/cpufreq/imx.h                        |  10 ++
 33 files changed, 223 insertions(+), 247 deletions(-)
 create mode 100644 arch/arm/mach-pxa/include/mach/generic.h
 rename arch/arm/mach-s3c24xx/{ => include/mach}/s3c2412.h (100%)
 create mode 100644 arch/arm/mach-sa1100/include/mach/generic.h
 rename arch/arm/mach-davinci/cpufreq.c => drivers/cpufreq/davinci-cpufreq.c (99%)
 rename arch/arm/mach-imx/cpufreq.c => drivers/cpufreq/imx-cpufreq.c (99%)
 rename arch/arm/mach-integrator/cpu.c => drivers/cpufreq/integrator-cpufreq.c (99%)
 rename arch/arm/mach-pxa/cpufreq-pxa2xx.c => drivers/cpufreq/pxa2xx-cpufreq.c (99%)
 rename arch/arm/mach-pxa/cpufreq-pxa3xx.c => drivers/cpufreq/pxa3xx-cpufreq.c (98%)
 rename arch/arm/mach-s3c24xx/cpufreq-s3c2410.c => drivers/cpufreq/s3c2410-cpufreq.c (100%)
 rename arch/arm/mach-s3c24xx/cpufreq-s3c2412.c => drivers/cpufreq/s3c2412-cpufreq.c (99%)
 rename arch/arm/mach-s3c24xx/cpufreq-s3c2440.c => drivers/cpufreq/s3c2440-cpufreq.c (100%)
 rename arch/arm/mach-s3c24xx/cpufreq-debugfs.c => drivers/cpufreq/s3c24xx-cpufreq-debugfs.c (100%)
 rename arch/arm/mach-s3c24xx/cpufreq.c => drivers/cpufreq/s3c24xx-cpufreq.c (100%)
 rename arch/arm/mach-sa1100/cpu-sa1100.c => drivers/cpufreq/sa1100-cpufreq.c (99%)
 rename arch/arm/mach-sa1100/cpu-sa1110.c => drivers/cpufreq/sa1110-cpufreq.c (99%)
 rename arch/arm/mach-tegra/cpu-tegra.c => drivers/cpufreq/tegra-cpufreq.c (99%)
 create mode 100644 include/linux/cpufreq/imx.h

--
1.7.12.rc2.18.g61b472e

12 years, 2 months


[PATCH v2 0/2] Add devfreq runtime pm support

by Rajagopal Venkat

Patch to bind devfreq to runtime pm framework. Instead of explicitly using devfreq_suspend_device() and devfreq_resume_device() apis for devfreq core suspend/resume, let runtime-pm core handle it automatically. Suspend device devfreq core load monitoring with pm_runtime_suspend() and resume back on pm_runtime_resume().

Discussed at http://comments.gmane.org/gmane.linux.linaro.devel/13787

Changes from v1:
- improved change log and code comments
- added NULL check for devfreq runtime-pm callbacks

-----
Rajagopal Venkat (2):
  PM / devfreq: Fix compiler warnings
  PM / devfreq: tie suspend/resume to runtime-pm

 drivers/base/power/runtime.c | 21 ++++++++++++-
 drivers/devfreq/devfreq.c | 69 +++++++++++++++++++++++++++++++++++++++---
 include/linux/devfreq.h | 30 ++++++------------
 include/linux/pm.h | 2 ++
 4 files changed, 97 insertions(+), 25 deletions(-)

-- 
1.7.10.4

12 years, 2 months


[PATCH] Fix LPAE for KVM enabled builds

by Alexander Spyridakis

Hello, With the latest linaro-kernel release (ll_20130321.0), LPAE seems to be broken on Versatile Express (and possibly other targets too): the kernel hangs very early in the boot process when LPAE is enabled. KVM builds depend on LPAE, so it would be good to see this fixed in the next release. The fix is attached. Regards.

12 years, 2 months


[PATCH 0/3] Sync Android pstore updates

by Anton Vorontsov

Hi all, Here are a few updates from the Android dev tree. Thanks to Arve Hjønnevåg for the code, and John Stultz for actually preparing commits for submission. Unless there are objections, I'll push these updates to linux-pstore.git. Thanks! Anton

12 years, 2 months


[PATCH 0/9] CPUFreq: Move drivers to drivers/cpufreq - Part 2

by Viresh Kumar

Hi,

This is the second patchset toward migrating all cpufreq drivers to drivers/cpufreq (the earlier one was for ARM drivers).

These are all applied here:

http://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/h…

They have not been tested, not even compile-tested.

Viresh Kumar (9):
  AVR32: cpufreq: move cpufreq driver to drivers/cpufreq
  blackfin: cpufreq: move cpufreq driver to drivers/cpufreq
  cris: cpufreq: move cpufreq driver to drivers/cpufreq
  ia64: cpufreq: move cpufreq driver to drivers/cpufreq
  mips: cpufreq: move cpufreq driver to drivers/cpufreq
  sh: cpufreq: move cpufreq driver to drivers/cpufreq
  unicore2: cpufreq: move cpufreq driver to drivers/cpufreq
  spark: cpufreq: move cpufreq driver to drivers/cpufreq
  powerpc: cpufreq: move cpufreq driver to drivers/cpufreq

 arch/avr32/Kconfig | 13 ----
 arch/avr32/configs/atngw100_defconfig | 2 +-
 arch/avr32/configs/atngw100_evklcd100_defconfig | 2 +-
 arch/avr32/configs/atngw100_evklcd101_defconfig | 2 +-
 arch/avr32/configs/atngw100_mrmt_defconfig | 2 +-
 arch/avr32/configs/atngw100mkii_defconfig | 2 +-
 .../avr32/configs/atngw100mkii_evklcd100_defconfig | 2 +-
 .../avr32/configs/atngw100mkii_evklcd101_defconfig | 2 +-
 arch/avr32/configs/atstk1002_defconfig | 2 +-
 arch/avr32/configs/atstk1003_defconfig | 2 +-
 arch/avr32/configs/atstk1004_defconfig | 2 +-
 arch/avr32/configs/atstk1006_defconfig | 2 +-
 arch/avr32/configs/favr-32_defconfig | 2 +-
 arch/avr32/configs/hammerhead_defconfig | 2 +-
 arch/avr32/configs/mimc200_defconfig | 2 +-
 arch/avr32/mach-at32ap/Makefile | 1 -
 arch/blackfin/mach-common/Makefile | 1 -
 arch/cris/arch-v32/mach-a3/Makefile | 1 -
 arch/cris/arch-v32/mach-fs/Makefile | 1 -
 arch/ia64/Kconfig | 5 +-
 arch/ia64/kernel/Makefile | 1 -
 arch/ia64/kernel/cpufreq/Kconfig | 29 -------
 arch/ia64/kernel/cpufreq/Makefile | 2 -
 arch/mips/Kconfig | 9 ++-
 arch/mips/kernel/Makefile | 2 -
 arch/mips/kernel/cpufreq/Kconfig | 41 ----------
 arch/mips/kernel/cpufreq/Makefile | 5 --
 arch/powerpc/platforms/Kconfig | 31 --------
 arch/powerpc/platforms/pasemi/Makefile | 1 -
 arch/powerpc/platforms/powermac/Makefile | 2 -
 arch/sh/Kconfig | 18 -----
 arch/sh/kernel/Makefile | 1 -
 arch/sparc/Kconfig | 23 ------
 arch/sparc/kernel/Makefile | 3 -
 arch/unicore32/kernel/Makefile | 1 -
 drivers/cpufreq/Kconfig | 89 ++++++++++++++++++++++
 drivers/cpufreq/Kconfig.powerpc | 26 +++++++
 drivers/cpufreq/Makefile | 16 ++++
 .../cpufreq.c => drivers/cpufreq/at32ap-cpufreq.c | 0
 .../cpufreq/blackfin-cpufreq.c | 0
 .../cpufreq/cris-artpec3-cpufreq.c | 0
 .../cpufreq/cris-etraxfs-cpufreq.c | 0
 .../cpufreq/ia64-acpi-cpufreq.c | 1 -
 .../kernel => drivers}/cpufreq/loongson2_cpufreq.c | 0
 .../cpufreq.c => drivers/cpufreq/pasemi-cpufreq.c | 0
 .../cpufreq/pmac32-cpufreq.c | 0
 .../cpufreq/pmac64-cpufreq.c | 0
 .../cpufreq.c => drivers/cpufreq/sh-cpufreq.c | 2 -
 .../cpufreq/spark-us2e-cpufreq.c | 0
 .../cpufreq/spark-us3-cpufreq.c | 0
 .../cpufreq/unicore2-cpufreq.c | 2 +-
 51 files changed, 156 insertions(+), 199 deletions(-)
 delete mode 100644 arch/ia64/kernel/cpufreq/Kconfig
 delete mode 100644 arch/ia64/kernel/cpufreq/Makefile
 delete mode 100644 arch/mips/kernel/cpufreq/Kconfig
 delete mode 100644 arch/mips/kernel/cpufreq/Makefile
 rename arch/avr32/mach-at32ap/cpufreq.c => drivers/cpufreq/at32ap-cpufreq.c (100%)
 rename arch/blackfin/mach-common/cpufreq.c => drivers/cpufreq/blackfin-cpufreq.c (100%)
 rename arch/cris/arch-v32/mach-a3/cpufreq.c => drivers/cpufreq/cris-artpec3-cpufreq.c (100%)
 rename arch/cris/arch-v32/mach-fs/cpufreq.c => drivers/cpufreq/cris-etraxfs-cpufreq.c (100%)
 rename arch/ia64/kernel/cpufreq/acpi-cpufreq.c => drivers/cpufreq/ia64-acpi-cpufreq.c (99%)
 rename {arch/mips/kernel => drivers}/cpufreq/loongson2_cpufreq.c (100%)
 rename arch/powerpc/platforms/pasemi/cpufreq.c => drivers/cpufreq/pasemi-cpufreq.c (100%)
 rename arch/powerpc/platforms/powermac/cpufreq_32.c => drivers/cpufreq/pmac32-cpufreq.c (100%)
 rename arch/powerpc/platforms/powermac/cpufreq_64.c => drivers/cpufreq/pmac64-cpufreq.c (100%)
 rename arch/sh/kernel/cpufreq.c => drivers/cpufreq/sh-cpufreq.c (99%)
 rename arch/sparc/kernel/us2e_cpufreq.c => drivers/cpufreq/spark-us2e-cpufreq.c (100%)
 rename arch/sparc/kernel/us3_cpufreq.c => drivers/cpufreq/spark-us3-cpufreq.c (100%)
 rename arch/unicore32/kernel/cpu-ucv2.c => drivers/cpufreq/unicore2-cpufreq.c (96%)

-- 
1.7.12.rc2.18.g61b472e

12 years, 2 months


Re: [PATCH 8/9] spark: cpufreq: move cpufreq driver to drivers/cpufreq

by Viresh Kumar

On 3 April 2013 22:08, David Miller <davem(a)davemloft.net> wrote:
> From: Viresh Kumar <viresh.kumar(a)linaro.org>
> Date: Wed, 3 Apr 2013 14:59:44 +0530
>
>> On 1 April 2013 10:11, Viresh Kumar <viresh.kumar(a)linaro.org> wrote:
>>> On 31 March 2013 22:10, David Miller <davem(a)davemloft.net> wrote:
>>>>> On 26 March 2013 09:55, Viresh Kumar <viresh.kumar(a)linaro.org> wrote:
>>>>>> From: Viresh Kumar <viresh.kumar(a)linaro.org>
>>>>>> Date: Mon, 25 Mar 2013 11:20:23 +0530
>>>>>> Subject: [PATCH] sparc: cpufreq: move cpufreq driver to drivers/cpufreq
>>>
>>>> Subject line still has the "spark" typo.
>>>
>>> Your mail was scary, really... HOW can i do it??
>>>
>>> And then i saw how you got it wrong. I haven't sent a new mail, so mails subject
>>> remains the same... I copied V2 in the same mail.. Check above, subject looks
>>> fine :)
>>
>> Hi David,
>>
>> I think all pending issues are fixed now... Can i have your Ack please?
>> Or maybe more comments :)
>
> Acked-by: David S. Miller <davem(a)davemloft.net>

Adding everybody else in cc.

12 years, 2 months


[PATCH 1/9] ARM: cpuidle: remove useless declaration

by Daniel Lezcano

The noop functions code is not necessary because the header file is included in files which are compiled when CONFIG_CPU_IDLE is on.

Signed-off-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
---
 arch/arm/include/asm/cpuidle.h | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/cpuidle.h b/arch/arm/include/asm/cpuidle.h
index 2fca60a..7367787 100644
--- a/arch/arm/include/asm/cpuidle.h
+++ b/arch/arm/include/asm/cpuidle.h
@@ -1,13 +1,8 @@
 #ifndef __ASM_ARM_CPUIDLE_H
 #define __ASM_ARM_CPUIDLE_H
 
-#ifdef CONFIG_CPU_IDLE
 extern int arm_cpuidle_simple_enter(struct cpuidle_device *dev,
-		struct cpuidle_driver *drv, int index);
-#else
-static inline int arm_cpuidle_simple_enter(struct cpuidle_device *dev,
-		struct cpuidle_driver *drv, int index) { return -ENODEV; }
-#endif
+				      struct cpuidle_driver *drv, int index);
 
 /* Common ARM WFI state */
 #define ARM_CPUIDLE_WFI_STATE_PWR(p) {\
-- 
1.7.9.5
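
For readers unfamiliar with the pattern this patch removes, here is a minimal standalone sketch of a config-guarded prototype with a no-op fallback. HAVE_IDLE_DRIVER, idle_simple_enter() and try_enter_idle() are hypothetical names chosen for illustration, not kernel symbols; only the shape (a real prototype when the option is on, a static inline stub returning -ENODEV when it is off) mirrors the header in the patch.

```c
#include <assert.h>
#include <errno.h>

/*
 * HAVE_IDLE_DRIVER is a hypothetical stand-in for a Kconfig symbol such
 * as CONFIG_CPU_IDLE. It is deliberately left undefined in this sketch,
 * so the stub branch below is the one that gets compiled.
 */
#ifdef HAVE_IDLE_DRIVER
/* With the option on, the real function lives in another file. */
extern int idle_simple_enter(int index);
#else
/* With the option off, callers still compile and get -ENODEV back. */
static inline int idle_simple_enter(int index)
{
	(void)index;
	return -ENODEV;
}
#endif

/* Hypothetical caller that builds whether or not the option is set. */
static int try_enter_idle(int index)
{
	assert(index >= 0);	/* sketch-only sanity check */
	return idle_simple_enter(index);
}
```

The stub is only worth keeping when some caller is built with the option disabled. The patch argues that this is not the case for the ARM header, since every file including it is compiled only when CONFIG_CPU_IDLE is on, which is why the fallback can be dropped.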

12 years, 2 months


[PATCH 1/9] ARM: cpuidle: remove useless declaration

by Daniel Lezcano

Hi Rafael, I noticed this patchset went to the linux-arm-project in patchwork, but it was addressed to you / linux-pm. AFAIU you rely on patchwork to manage the patches, so I was wondering whether you might have missed them. Thanks -- Daniel

12 years, 2 months


Re: [PATCH 1/9] AVR32: cpufreq: move cpufreq driver to drivers/cpufreq

by Viresh Kumar

On 3 April 2013 15:12, Hans-Christian Egtvedt <egtvedt(a)samfundet.no> wrote:
> Around Wed 03 Apr 2013 14:58:29 +0530 or thereabout, Viresh Kumar wrote:
>> Ping!!
>
> If it still builds, then
>
> Acked-by: Hans-Christian Egtvedt <egtvedt(a)samfundet.no>
>
> It only uses global header files, so should be fine.

Thanks.

Ahh!! I have removed others from my last Ping!! mail (didn't want to spam their inbox), but yes they are aware of your Ack now. Thanks.

12 years, 2 months


[PATCH v3] memcg: Add memory.pressure_level events

by Anton Vorontsov

With this patch userland applications that want to maintain the interactivity/memory allocation cost can use the pressure level notifications. The levels are defined like this: The "low" level means that the system is reclaiming memory for new allocations. Monitoring this reclaiming activity might be useful for maintaining cache level. Upon notification, the program (typically "Activity Manager") might analyze vmstat and act in advance (i.e. prematurely shutdown unimportant services). The "medium" level means that the system is experiencing medium memory pressure, the system might be making swap, paging out active file caches, etc. Upon this event applications may decide to further analyze vmstat/zoneinfo/memcg or internal memory usage statistics and free any resources that can be easily reconstructed or re-read from a disk. The "critical" level means that the system is actively thrashing, it is about to out of memory (OOM) or even the in-kernel OOM killer is on its way to trigger. Applications should do whatever they can to help the system. It might be too late to consult with vmstat or any other statistics, so it's advisable to take an immediate action. The events are propagated upward until the event is handled, i.e. the events are not pass-through. Here is what this means: for example you have three cgroups: A->B->C. Now you set up an event listener on cgroups A, B and C, and suppose group C experiences some pressure. In this situation, only group C will receive the notification, i.e. groups A and B will not receive it. This is done to avoid excessive "broadcasting" of messages, which disturbs the system and which is especially bad if we are low on memory or thrashing. So, organize the cgroups wisely, or propagate the events manually (or, ask us to implement the pass-through events, explaining why would you need them.) 
Performance wise, the memory pressure notifications feature itself is lightweight and does not require much of bookkeeping, in contrast to the rest of memcg features. Unfortunately, as of current memcg implementation, pages accounting is an inseparable part and cannot be turned off. The good news is that there are some efforts[1] to improve the situation; plus, implementing the same, fully API-compatible[2] interface for CONFIG_MEMCG=n case (e.g. embedded) is also a viable option, so it will not require any changes on the userland side. [1] http://permalink.gmane.org/gmane.linux.kernel.cgroups/6291 [2] http://lkml.org/lkml/2013/2/21/454 Signed-off-by: Anton Vorontsov <anton.vorontsov(a)linaro.org> Acked-by: Kirill A. Shutemov <kirill(a)shutemov.name> --- Hi all, Here is a shiny new v3! In v3: - No changes in the code, just updated commit message to incorporate the answer to Minchan Kim's comment regarding applicability to embedded use cases in the light of memcg performance overhead, plus gave some references to Glauber Costa's memcg work. - Rebased onto 3.9.0-rc3-next-20130321. In v2: - Addressed Glauber Costa's comments: o Use parent_mem_cgroup() instead of own parent function (also suggested by Kamezawa). This change also affected events distribution logic, so it became more like memory thresholds notifications, i.e. we deliver the event to the cgroup where the event originated, not to the parent cgroup; (This also addreses Kamezawa's remark regarding which cgroup receives which event.) o Register vmpressure cgroup file directly in memcontrol.c. - Addressed Greg Thelen's comments: o Fixed bool/int inconsistency in the code; o Fixed nr_scanned accounting; o Don't use cryptic 's', 'r' abbreviations; get rid of confusing 'window' argument. - Addressed Kamezawa Hiroyuki's comments: o Moved declarations from mm/internal.h into linux/vmpressue.h; o Removed Kconfig symbol. Vmpressure is pretty lightweight (especially comparing to the memcg accounting). 
If it ever causes any measurable performance effect, we want to fix it, not paper it over with a Kconfig option. :-) o Removed read operation on pressure_level cgroup file. In apps, we only use notifications, we don't need the content of the file, so let's keep things simple for now. Plus this resolves questions like what should we return there when the system is not reclaiming; o Reworded documentation; o Improved comments for vmpressure_prio(). Old changelogs/submissions: v2: http://lkml.org/lkml/2013/2/18/577 v1: http://lkml.org/lkml/2013/2/10/140 mempressure cgroup: http://lkml.org/lkml/2013/1/4/55 Documentation/cgroups/memory.txt | 61 +++++++++- include/linux/vmpressure.h | 47 ++++++++ mm/Makefile | 2 +- mm/memcontrol.c | 28 +++++ mm/vmpressure.c | 252 +++++++++++++++++++++++++++++++++++++++ mm/vmscan.c | 8 ++ 6 files changed, 396 insertions(+), 2 deletions(-) create mode 100644 include/linux/vmpressure.h create mode 100644 mm/vmpressure.c diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt index addb1f1..0c004de 100644 --- a/Documentation/cgroups/memory.txt +++ b/Documentation/cgroups/memory.txt @@ -40,6 +40,7 @@ Features: - soft limit - moving (recharging) account at moving a task is selectable. - usage threshold notifier + - memory pressure notifier - oom-killer disable knob and oom-notifier - Root cgroup has no limit controls. @@ -65,6 +66,7 @@ Brief summary of control files. memory.stat # show various statistics memory.use_hierarchy # set/show hierarchical account enabled memory.force_empty # trigger forced move charge to parent + memory.pressure_level # set memory pressure notifications memory.swappiness # set/show swappiness parameter of vmscan (See sysctl's vm.swappiness) memory.move_charge_at_immigrate # set/show controls of moving charges @@ -778,7 +780,64 @@ At reading, current status of OOM is shown. under_oom 0 or 1 (if 1, the memory cgroup is under OOM, tasks may be stopped.) -11. TODO +11. 
Memory Pressure + +The pressure level notifications can be used to monitor the memory +allocation cost; based on the pressure, applications can implement +different strategies of managing their memory resources. The pressure +levels are defined as following: + +The "low" level means that the system is reclaiming memory for new +allocations. Monitoring this reclaiming activity might be useful for +maintaining cache level. Upon notification, the program (typically +"Activity Manager") might analyze vmstat and act in advance (i.e. +prematurely shutdown unimportant services). + +The "medium" level means that the system is experiencing medium memory +pressure, the system might be making swap, paging out active file caches, +etc. Upon this event applications may decide to further analyze +vmstat/zoneinfo/memcg or internal memory usage statistics and free any +resources that can be easily reconstructed or re-read from a disk. + +The "critical" level means that the system is actively thrashing, it is +about to out of memory (OOM) or even the in-kernel OOM killer is on its +way to trigger. Applications should do whatever they can to help the +system. It might be too late to consult with vmstat or any other +statistics, so it's advisable to take an immediate action. + +The events are propagated upward until the event is handled, i.e. the +events are not pass-through. Here is what this means: for example you have +three cgroups: A->B->C. Now you set up an event listener on cgroups A, B +and C, and suppose group C experiences some pressure. In this situation, +only group C will receive the notification, i.e. groups A and B will not +receive it. This is done to avoid excessive "broadcasting" of messages, +which disturbs the system and which is especially bad if we are low on +memory or thrashing. So, organize the cgroups wisely, or propagate the +events manually (or, ask us to implement the pass-through events, +explaining why would you need them.) 
+ +The file memory.pressure_level is only used to setup an eventfd, +read/write operations are no implemented. + +Test: + + Here is a small script example that makes a new cgroup, sets up a + memory limit, sets up a notification in the cgroup and then makes child + cgroup experience a critical pressure: + + # cd /sys/fs/cgroup/memory/ + # mkdir foo + # cd foo + # cgroup_event_listener memory.pressure_level low & + # echo 8000000 > memory.limit_in_bytes + # echo 8000000 > memory.memsw.limit_in_bytes + # echo $$ > tasks + # dd if=/dev/zero | read x + + (Expect a bunch of notifications, and eventually, the oom-killer will + trigger.) + +12. TODO 1. Add support for accounting huge pages (as a separate controller) 2. Make per-cgroup scanner reclaim not-shared pages first diff --git a/include/linux/vmpressure.h b/include/linux/vmpressure.h new file mode 100644 index 0000000..fa84783 --- /dev/null +++ b/include/linux/vmpressure.h @@ -0,0 +1,47 @@ +#ifndef __LINUX_VMPRESSURE_H +#define __LINUX_VMPRESSURE_H + +#include <linux/mutex.h> +#include <linux/list.h> +#include <linux/workqueue.h> +#include <linux/gfp.h> +#include <linux/types.h> +#include <linux/cgroup.h> + +struct vmpressure { + unsigned int scanned; + unsigned int reclaimed; + /* The lock is used to keep the scanned/reclaimed above in sync. */ + struct mutex sr_lock; + + struct list_head events; + /* Have to grab the lock on events traversal or modifications. 
*/ + struct mutex events_lock; + + struct work_struct work; +}; + +struct mem_cgroup; + +#ifdef CONFIG_MEMCG +extern void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, + unsigned long scanned, unsigned long reclaimed); +extern void vmpressure_prio(gfp_t gfp, struct mem_cgroup *memcg, int prio); +#else +static inline void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, + unsigned long scanned, unsigned long reclaimed) {} +static inline void vmpressure_prio(gfp_t gfp, struct mem_cgroup *memcg, + int prio) {} +#endif /* CONFIG_MEMCG */ + +extern void vmpressure_init(struct vmpressure *vmpr); +extern struct vmpressure *memcg_to_vmpr(struct mem_cgroup *memcg); +extern struct cgroup_subsys_state *vmpr_to_css(struct vmpressure *vmpr); +extern struct vmpressure *css_to_vmpr(struct cgroup_subsys_state *css); +extern int vmpressure_register_event(struct cgroup *cg, struct cftype *cft, + struct eventfd_ctx *eventfd, + const char *args); +extern void vmpressure_unregister_event(struct cgroup *cg, struct cftype *cft, + struct eventfd_ctx *eventfd); + +#endif /* __LINUX_VMPRESSURE_H */ diff --git a/mm/Makefile b/mm/Makefile index 3a46287..72c5acb 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -50,7 +50,7 @@ obj-$(CONFIG_FS_XIP) += filemap_xip.o obj-$(CONFIG_MIGRATION) += migrate.o obj-$(CONFIG_QUICKLIST) += quicklist.o obj-$(CONFIG_TRANSPARENT_HUGEPAGE) += huge_memory.o -obj-$(CONFIG_MEMCG) += memcontrol.o page_cgroup.o +obj-$(CONFIG_MEMCG) += memcontrol.o page_cgroup.o vmpressure.o obj-$(CONFIG_CGROUP_HUGETLB) += hugetlb_cgroup.o obj-$(CONFIG_MEMORY_FAILURE) += memory-failure.o obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o diff --git a/mm/memcontrol.c b/mm/memcontrol.c index f608546..2482f2c 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -49,6 +49,7 @@ #include <linux/fs.h> #include <linux/seq_file.h> #include <linux/vmalloc.h> +#include <linux/vmpressure.h> #include <linux/mm_inline.h> #include <linux/page_cgroup.h> #include <linux/cpu.h> @@ -376,6 +377,9 @@ 
struct mem_cgroup { atomic_t numainfo_events; atomic_t numainfo_updating; #endif + + struct vmpressure vmpr; + /* * Per cgroup active and inactive list, similar to the * per zone LRU lists. @@ -576,6 +580,24 @@ struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *s) return container_of(s, struct mem_cgroup, css); } +/* Some nice accessors for the vmpressure. */ +struct vmpressure *memcg_to_vmpr(struct mem_cgroup *memcg) +{ + if (!memcg) + memcg = root_mem_cgroup; + return &memcg->vmpr; +} + +struct cgroup_subsys_state *vmpr_to_css(struct vmpressure *vmpr) +{ + return &container_of(vmpr, struct mem_cgroup, vmpr)->css; +} + +struct vmpressure *css_to_vmpr(struct cgroup_subsys_state *css) +{ + return &mem_cgroup_from_css(css)->vmpr; +} + static inline bool mem_cgroup_is_root(struct mem_cgroup *memcg) { return (memcg == root_mem_cgroup); @@ -6074,6 +6096,11 @@ static struct cftype mem_cgroup_files[] = { .unregister_event = mem_cgroup_oom_unregister_event, .private = MEMFILE_PRIVATE(_OOM_TYPE, OOM_CONTROL), }, + { + .name = "pressure_level", + .register_event = vmpressure_register_event, + .unregister_event = vmpressure_unregister_event, + }, #ifdef CONFIG_NUMA { .name = "numa_stat", @@ -6365,6 +6392,7 @@ mem_cgroup_css_alloc(struct cgroup *cont) memcg->move_charge_at_immigrate = 0; mutex_init(&memcg->thresholds_lock); spin_lock_init(&memcg->move_lock); + vmpressure_init(&memcg->vmpr); return &memcg->css; diff --git a/mm/vmpressure.c b/mm/vmpressure.c new file mode 100644 index 0000000..ae0ff8e --- /dev/null +++ b/mm/vmpressure.c @@ -0,0 +1,252 @@ +/* + * Linux VM pressure + * + * Copyright 2012 Linaro Ltd. + * Anton Vorontsov <anton.vorontsov(a)linaro.org> + * + * Based on ideas from Andrew Morton, David Rientjes, KOSAKI Motohiro, + * Leonid Moiseichuk, Mel Gorman, Minchan Kim and Pekka Enberg. 
+ * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + */ + +#include <linux/cgroup.h> +#include <linux/fs.h> +#include <linux/sched.h> +#include <linux/mm.h> +#include <linux/vmstat.h> +#include <linux/eventfd.h> +#include <linux/swap.h> +#include <linux/printk.h> +#include <linux/vmpressure.h> + +/* + * The window size is the number of scanned pages before we try to analyze + * the scanned/reclaimed ratio (or difference). + * + * It is used as a rate-limit tunable for the "low" level notification, + * and for averaging medium/critical levels. Using small window sizes can + * cause lot of false positives, but too big window size will delay the + * notifications. + * + * TODO: Make the window size depend on machine size, as we do for vmstat + * thresholds. + */ +static const unsigned int vmpressure_win = SWAP_CLUSTER_MAX * 16; +static const unsigned int vmpressure_level_med = 60; +static const unsigned int vmpressure_level_critical = 95; +static const unsigned int vmpressure_level_critical_prio = 3; + +enum vmpressure_levels { + VMPRESSURE_LOW = 0, + VMPRESSURE_MEDIUM, + VMPRESSURE_CRITICAL, + VMPRESSURE_NUM_LEVELS, +}; + +static const char *vmpressure_str_levels[] = { + [VMPRESSURE_LOW] = "low", + [VMPRESSURE_MEDIUM] = "medium", + [VMPRESSURE_CRITICAL] = "critical", +}; + +static enum vmpressure_levels vmpressure_level(unsigned int pressure) +{ + if (pressure >= vmpressure_level_critical) + return VMPRESSURE_CRITICAL; + else if (pressure >= vmpressure_level_med) + return VMPRESSURE_MEDIUM; + return VMPRESSURE_LOW; +} + +static enum vmpressure_levels vmpressure_calc_level(unsigned int scanned, + unsigned int reclaimed) +{ + unsigned long scale = scanned + reclaimed; + unsigned long pressure; + + if (!scanned) + return VMPRESSURE_LOW; + + /* + * We calculate the ratio (in percents) of how many pages were + * scanned vs. 
reclaimed in a given time frame (window). Note that + * time is in VM reclaimer's "ticks", i.e. number of pages + * scanned. This makes it possible to set desired reaction time + * and serves as a ratelimit. + */ + pressure = scale - (reclaimed * scale / scanned); + pressure = pressure * 100 / scale; + + pr_debug("%s: %3lu (s: %6u r: %6u)\n", __func__, pressure, + scanned, reclaimed); + + return vmpressure_level(pressure); +} + +void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, + unsigned long scanned, unsigned long reclaimed) +{ + struct vmpressure *vmpr = memcg_to_vmpr(memcg); + + /* + * So far we are only interested application memory, or, in case + * of low pressure, in FS/IO memory reclaim. We are also + * interested indirect reclaim (kswapd sets sc->gfp_mask to + * GFP_KERNEL). + */ + if (!(gfp & (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_IO | __GFP_FS))) + return; + + if (!scanned) + return; + + mutex_lock(&vmpr->sr_lock); + vmpr->scanned += scanned; + vmpr->reclaimed += reclaimed; + mutex_unlock(&vmpr->sr_lock); + + if (scanned < vmpressure_win || work_pending(&vmpr->work)) + return; + schedule_work(&vmpr->work); +} + +void vmpressure_prio(gfp_t gfp, struct mem_cgroup *memcg, int prio) +{ + if (prio > vmpressure_level_critical_prio) + return; + + /* + * OK, the prio is below the threshold, updating vmpressure + * information before diving into long shrinking of long range + * vmscan. 
+ */ + vmpressure(gfp, memcg, vmpressure_win, 0); +} + +static struct vmpressure *wk_to_vmpr(struct work_struct *wk) +{ + return container_of(wk, struct vmpressure, work); +} + +static struct vmpressure *cg_to_vmpr(struct cgroup *cg) +{ + return css_to_vmpr(cgroup_subsys_state(cg, mem_cgroup_subsys_id)); +} + +struct vmpressure_event { + struct eventfd_ctx *efd; + enum vmpressure_levels level; + struct list_head node; +}; + +static bool vmpressure_event(struct vmpressure *vmpr, + unsigned long scanned, unsigned long reclaimed) +{ + struct vmpressure_event *ev; + int level = vmpressure_calc_level(scanned, reclaimed); + bool signalled = false; + + mutex_lock(&vmpr->events_lock); + + list_for_each_entry(ev, &vmpr->events, node) { + if (level >= ev->level) { + eventfd_signal(ev->efd, 1); + signalled = true; + } + } + + mutex_unlock(&vmpr->events_lock); + + return signalled; +} + +static struct vmpressure *vmpressure_parent(struct vmpressure *vmpr) +{ + struct cgroup *cg = vmpr_to_css(vmpr)->cgroup; + struct mem_cgroup *memcg = mem_cgroup_from_cont(cg); + + memcg = parent_mem_cgroup(memcg); + if (!memcg) + return NULL; + return memcg_to_vmpr(memcg); +} + +static void vmpressure_wk_fn(struct work_struct *wk) +{ + struct vmpressure *vmpr = wk_to_vmpr(wk); + unsigned long s; + unsigned long r; + + mutex_lock(&vmpr->sr_lock); + s = vmpr->scanned; + r = vmpr->reclaimed; + vmpr->scanned = 0; + vmpr->reclaimed = 0; + mutex_unlock(&vmpr->sr_lock); + + do { + if (vmpressure_event(vmpr, s, r)) + break; + /* + * If not handled, propagate the event upward into the + * hierarchy. 
+ */ + } while ((vmpr = vmpressure_parent(vmpr))); +} + +int vmpressure_register_event(struct cgroup *cg, struct cftype *cft, + struct eventfd_ctx *eventfd, const char *args) +{ + struct vmpressure *vmpr = cg_to_vmpr(cg); + struct vmpressure_event *ev; + int lvl; + + for (lvl = 0; lvl < VMPRESSURE_NUM_LEVELS; lvl++) { + if (!strcmp(vmpressure_str_levels[lvl], args)) + break; + } + + if (lvl >= VMPRESSURE_NUM_LEVELS) + return -EINVAL; + + ev = kzalloc(sizeof(*ev), GFP_KERNEL); + if (!ev) + return -ENOMEM; + + ev->efd = eventfd; + ev->level = lvl; + + mutex_lock(&vmpr->events_lock); + list_add(&ev->node, &vmpr->events); + mutex_unlock(&vmpr->events_lock); + + return 0; +} + +void vmpressure_unregister_event(struct cgroup *cg, struct cftype *cft, + struct eventfd_ctx *eventfd) +{ + struct vmpressure *vmpr = cg_to_vmpr(cg); + struct vmpressure_event *ev; + + mutex_lock(&vmpr->events_lock); + list_for_each_entry(ev, &vmpr->events, node) { + if (ev->efd != eventfd) + continue; + list_del(&ev->node); + kfree(ev); + break; + } + mutex_unlock(&vmpr->events_lock); +} + +void vmpressure_init(struct vmpressure *vmpr) +{ + mutex_init(&vmpr->sr_lock); + mutex_init(&vmpr->events_lock); + INIT_LIST_HEAD(&vmpr->events); + INIT_WORK(&vmpr->work, vmpressure_wk_fn); +} diff --git a/mm/vmscan.c b/mm/vmscan.c index df78d17..616e2bb 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -19,6 +19,7 @@ #include <linux/pagemap.h> #include <linux/init.h> #include <linux/highmem.h> +#include <linux/vmpressure.h> #include <linux/vmstat.h> #include <linux/file.h> #include <linux/writeback.h> @@ -1982,6 +1983,11 @@ static void shrink_zone(struct zone *zone, struct scan_control *sc) } memcg = mem_cgroup_iter(root, memcg, &reclaim); } while (memcg); + + vmpressure(sc->gfp_mask, sc->target_mem_cgroup, + sc->nr_scanned - nr_scanned, + sc->nr_reclaimed - nr_reclaimed); + } while (should_continue_reclaim(zone, sc->nr_reclaimed - nr_reclaimed, sc->nr_scanned - nr_scanned, sc)); } @@ -2167,6 +2173,8 @@ static 
unsigned long do_try_to_free_pages(struct zonelist *zonelist, count_vm_event(ALLOCSTALL); do { + vmpressure_prio(sc->gfp_mask, sc->target_mem_cgroup, + sc->priority); sc->nr_scanned = 0; aborted_reclaim = shrink_zones(zonelist, sc); -- 1.8.1.4
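
As a reading aid for the patch above, the following is a minimal userspace sketch of two pieces of its logic: the scanned/reclaimed ratio that selects a pressure level, and the "propagate upward until handled" delivery rule. The formula and the 60/95 thresholds are copied from vmpressure_calc_level() and vmpressure_level() in the patch. Everything else (calc_level(), struct group, deliver(), one listener per group, and the reclaimed <= scanned precondition) is a simplifying assumption made for this sketch, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum level { LEVEL_LOW = 0, LEVEL_MEDIUM, LEVEL_CRITICAL };

/* Thresholds taken from the patch (vmpressure_level_med/_critical). */
static const unsigned long level_med = 60;
static const unsigned long level_critical = 95;

/* Same arithmetic as vmpressure_calc_level() in the patch. */
static enum level calc_level(unsigned long scanned, unsigned long reclaimed)
{
	unsigned long scale = scanned + reclaimed;
	unsigned long pressure;

	if (!scanned)
		return LEVEL_LOW;

	/* Sketch-only precondition: keeps the subtraction from wrapping. */
	assert(reclaimed <= scanned);

	/* Percentage of scanned pages that could not be reclaimed. */
	pressure = scale - (reclaimed * scale / scanned);
	pressure = pressure * 100 / scale;

	if (pressure >= level_critical)
		return LEVEL_CRITICAL;
	if (pressure >= level_med)
		return LEVEL_MEDIUM;
	return LEVEL_LOW;
}

/* Hypothetical stand-in for a cgroup with at most one listener. */
struct group {
	struct group *parent;
	bool has_listener;
	enum level listen_level;	/* lowest level the listener wants */
	unsigned int signalled;		/* number of notifications received */
};

/*
 * Walk from the group where pressure originated toward the root and
 * stop at the first group whose listener accepts the level. Returns
 * that group, or NULL if nobody handled the event.
 */
static struct group *deliver(struct group *g, enum level level)
{
	for (; g; g = g->parent) {
		if (g->has_listener && level >= g->listen_level) {
			g->signalled++;
			return g;
		}
	}
	return NULL;
}
```

With three nested groups A->B->C that all listen for "low", calling deliver() on C notifies only C, which matches the commit message. If C had no listener, the same call would stop at B instead. For the ratio, scanning 512 pages and reclaiming 128 gives a pressure of 75, which falls between the two thresholds and maps to "medium".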

12 years, 2 months


[PATCH v4] memcg: Add memory.pressure_level events

by Anton Vorontsov

With this patch userland applications that want to maintain the interactivity/memory allocation cost can use the pressure level notifications. The levels are defined like this: The "low" level means that the system is reclaiming memory for new allocations. Monitoring this reclaiming activity might be useful for maintaining cache level. Upon notification, the program (typically "Activity Manager") might analyze vmstat and act in advance (i.e. prematurely shutdown unimportant services). The "medium" level means that the system is experiencing medium memory pressure, the system might be making swap, paging out active file caches, etc. Upon this event applications may decide to further analyze vmstat/zoneinfo/memcg or internal memory usage statistics and free any resources that can be easily reconstructed or re-read from a disk. The "critical" level means that the system is actively thrashing, it is about to out of memory (OOM) or even the in-kernel OOM killer is on its way to trigger. Applications should do whatever they can to help the system. It might be too late to consult with vmstat or any other statistics, so it's advisable to take an immediate action. The events are propagated upward until the event is handled, i.e. the events are not pass-through. Here is what this means: for example you have three cgroups: A->B->C. Now you set up an event listener on cgroups A, B and C, and suppose group C experiences some pressure. In this situation, only group C will receive the notification, i.e. groups A and B will not receive it. This is done to avoid excessive "broadcasting" of messages, which disturbs the system and which is especially bad if we are low on memory or thrashing. So, organize the cgroups wisely, or propagate the events manually (or, ask us to implement the pass-through events, explaining why would you need them.) 
Performance wise, the memory pressure notifications feature itself is lightweight and does not require much bookkeeping, in contrast to the rest of the memcg features. Unfortunately, as of the current memcg implementation, page accounting is an inseparable part and cannot be turned off. The good news is that there are some efforts[1] to improve the situation; plus, implementing the same, fully API-compatible[2] interface for the CONFIG_MEMCG=n case (e.g. embedded) is also a viable option, so it will not require any changes on the userland side.

[1] http://permalink.gmane.org/gmane.linux.kernel.cgroups/6291
[2] http://lkml.org/lkml/2013/2/21/454

Signed-off-by: Anton Vorontsov <anton.vorontsov(a)linaro.org>
Acked-by: Kirill A. Shutemov <kirill(a)shutemov.name>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com>
---

Hi all,

Thanks for the previous reviews!

In v4 I addressed Andrew's and Kamezawa's comments:

- Documented public interfaces and tunables;
- Added documentation for the eventfd interface;
- Some cosmetic changes: code rearrangements and variable renames (wk->work, lvl->level, etc.);
- Changed types for page counters from 'unsigned int' to 'unsigned long', this avoids possible overflows;
- Added Kamezawa's Ack, and rebased onto 3.9.0-rc5-next-20130402+.

In v3:

- No changes in the code, just updated the commit message to incorporate the answer to Minchan Kim's comment regarding applicability to embedded use cases in the light of memcg performance overhead, plus gave some references to Glauber Costa's memcg work.
- Rebased onto 3.9.0-rc3-next-20130321.

Old changelogs/submissions:

v3: http://lkml.org/lkml/2013/3/22/31
v2: http://lkml.org/lkml/2013/2/18/577
v1: http://lkml.org/lkml/2013/2/10/140
mempressure cgroup: http://lkml.org/lkml/2013/1/4/55

 Documentation/cgroups/memory.txt |  70 +++++++-
 include/linux/vmpressure.h       |  48 +++++
 mm/Makefile                      |   2 +-
 mm/memcontrol.c                  |  29 +++
 mm/vmpressure.c                  | 374 +++++++++++++++++++++++++++++++++++++++
 mm/vmscan.c                      |   8 +
 6 files changed, 529 insertions(+), 2 deletions(-)
 create mode 100644 include/linux/vmpressure.h
 create mode 100644 mm/vmpressure.c

diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index 3aaa984..1178e23 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -40,6 +40,7 @@ Features:
  - soft limit
  - moving (recharging) account at moving a task is selectable.
  - usage threshold notifier
+ - memory pressure notifier
  - oom-killer disable knob and oom-notifier
  - Root cgroup has no limit controls.
 
@@ -65,6 +66,7 @@ Brief summary of control files.
  memory.stat			 # show various statistics
  memory.use_hierarchy		 # set/show hierarchical account enabled
  memory.force_empty		 # trigger forced move charge to parent
+ memory.pressure_level		 # set memory pressure notifications
  memory.swappiness		 # set/show swappiness parameter of vmscan
 				 (See sysctl's vm.swappiness)
  memory.move_charge_at_immigrate # set/show controls of moving charges
@@ -778,7 +780,73 @@ At reading, current status of OOM is shown.
 	under_oom	 0 or 1 (if 1, the memory cgroup is under OOM, tasks may
 			 be stopped.)
 
-11. TODO
+11. Memory Pressure
+
+The pressure level notifications can be used to monitor the memory
+allocation cost; based on the pressure, applications can implement
+different strategies of managing their memory resources. The pressure
+levels are defined as following:
+
+The "low" level means that the system is reclaiming memory for new
+allocations. Monitoring this reclaiming activity might be useful for
+maintaining cache level. Upon notification, the program (typically
+"Activity Manager") might analyze vmstat and act in advance (i.e.
+prematurely shutdown unimportant services).
+
+The "medium" level means that the system is experiencing medium memory
+pressure, the system might be making swap, paging out active file caches,
+etc. Upon this event applications may decide to further analyze
+vmstat/zoneinfo/memcg or internal memory usage statistics and free any
+resources that can be easily reconstructed or re-read from a disk.
+
+The "critical" level means that the system is actively thrashing, it is
+about to out of memory (OOM) or even the in-kernel OOM killer is on its
+way to trigger. Applications should do whatever they can to help the
+system. It might be too late to consult with vmstat or any other
+statistics, so it's advisable to take an immediate action.
+
+The events are propagated upward until the event is handled, i.e. the
+events are not pass-through. Here is what this means: for example you have
+three cgroups: A->B->C. Now you set up an event listener on cgroups A, B
+and C, and suppose group C experiences some pressure. In this situation,
+only group C will receive the notification, i.e. groups A and B will not
+receive it. This is done to avoid excessive "broadcasting" of messages,
+which disturbs the system and which is especially bad if we are low on
+memory or thrashing. So, organize the cgroups wisely, or propagate the
+events manually (or, ask us to implement the pass-through events,
+explaining why would you need them.)
+
+The file memory.pressure_level is only used to setup an eventfd. To
+register a notification, an application must:
+
+- create an eventfd using eventfd(2);
+- open memory.pressure_level;
+- write string like "<event_fd> <fd of memory.pressure_level> <level>"
+  to cgroup.event_control.
+
+Application will be notified through eventfd when memory pressure is at
+the specific level (or higher). Read/write operations to
+memory.pressure_level are no implemented.
+
+Test:
+
+   Here is a small script example that makes a new cgroup, sets up a
+   memory limit, sets up a notification in the cgroup and then makes child
+   cgroup experience a critical pressure:
+
+   # cd /sys/fs/cgroup/memory/
+   # mkdir foo
+   # cd foo
+   # cgroup_event_listener memory.pressure_level low &
+   # echo 8000000 > memory.limit_in_bytes
+   # echo 8000000 > memory.memsw.limit_in_bytes
+   # echo $$ > tasks
+   # dd if=/dev/zero | read x
+
+   (Expect a bunch of notifications, and eventually, the oom-killer will
+   trigger.)
+
+12. TODO
 
 1. Add support for accounting huge pages (as a separate controller)
 2. Make per-cgroup scanner reclaim not-shared pages first
diff --git a/include/linux/vmpressure.h b/include/linux/vmpressure.h
new file mode 100644
index 0000000..2e86259
--- /dev/null
+++ b/include/linux/vmpressure.h
@@ -0,0 +1,48 @@
+#ifndef __LINUX_VMPRESSURE_H
+#define __LINUX_VMPRESSURE_H
+
+#include <linux/mutex.h>
+#include <linux/list.h>
+#include <linux/workqueue.h>
+#include <linux/gfp.h>
+#include <linux/types.h>
+#include <linux/cgroup.h>
+
+struct vmpressure {
+	unsigned long scanned;
+	unsigned long reclaimed;
+	/* The lock is used to keep the scanned/reclaimed above in sync. */
+	struct mutex sr_lock;
+
+	/* The list of vmpressure_event structs. */
+	struct list_head events;
+	/* Have to grab the lock on events traversal or modifications. */
+	struct mutex events_lock;
+
+	struct work_struct work;
+};
+
+struct mem_cgroup;
+
+#ifdef CONFIG_MEMCG
+extern void vmpressure(gfp_t gfp, struct mem_cgroup *memcg,
+		       unsigned long scanned, unsigned long reclaimed);
+extern void vmpressure_prio(gfp_t gfp, struct mem_cgroup *memcg, int prio);
+#else
+static inline void vmpressure(gfp_t gfp, struct mem_cgroup *memcg,
+			      unsigned long scanned, unsigned long reclaimed) {}
+static inline void vmpressure_prio(gfp_t gfp, struct mem_cgroup *memcg,
+				   int prio) {}
+#endif /* CONFIG_MEMCG */
+
+extern void vmpressure_init(struct vmpressure *vmpr);
+extern struct vmpressure *memcg_to_vmpressure(struct mem_cgroup *memcg);
+extern struct cgroup_subsys_state *vmpressure_to_css(struct vmpressure *vmpr);
+extern struct vmpressure *css_to_vmpressure(struct cgroup_subsys_state *css);
+extern int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
+				     struct eventfd_ctx *eventfd,
+				     const char *args);
+extern void vmpressure_unregister_event(struct cgroup *cg, struct cftype *cft,
+					struct eventfd_ctx *eventfd);
+
+#endif /* __LINUX_VMPRESSURE_H */
diff --git a/mm/Makefile b/mm/Makefile
index 3a46287..72c5acb 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -50,7 +50,7 @@ obj-$(CONFIG_FS_XIP) += filemap_xip.o
 obj-$(CONFIG_MIGRATION) += migrate.o
 obj-$(CONFIG_QUICKLIST) += quicklist.o
 obj-$(CONFIG_TRANSPARENT_HUGEPAGE) += huge_memory.o
-obj-$(CONFIG_MEMCG) += memcontrol.o page_cgroup.o
+obj-$(CONFIG_MEMCG) += memcontrol.o page_cgroup.o vmpressure.o
 obj-$(CONFIG_CGROUP_HUGETLB) += hugetlb_cgroup.o
 obj-$(CONFIG_MEMORY_FAILURE) += memory-failure.o
 obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f608546..64d75a2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -49,6 +49,7 @@
 #include <linux/fs.h>
 #include <linux/seq_file.h>
 #include <linux/vmalloc.h>
+#include <linux/vmpressure.h>
 #include <linux/mm_inline.h>
 #include <linux/page_cgroup.h>
 #include <linux/cpu.h>
@@ -315,6 +316,9 @@ struct mem_cgroup {
 	/* thresholds for mem+swap usage. RCU-protected */
 	struct mem_cgroup_thresholds memsw_thresholds;
 
+	/* vmpressure notifications */
+	struct vmpressure vmpressure;
+
 	union {
 		/* For oom notifier event fd */
 		struct list_head oom_notify;
@@ -376,6 +380,7 @@ struct mem_cgroup {
 	atomic_t	numainfo_events;
 	atomic_t	numainfo_updating;
 #endif
+
 	/*
 	 * Per cgroup active and inactive list, similar to the
 	 * per zone LRU lists.
@@ -576,6 +581,24 @@ struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *s)
 	return container_of(s, struct mem_cgroup, css);
 }
 
+/* Some nice accessors for the vmpressure. */
+struct vmpressure *memcg_to_vmpressure(struct mem_cgroup *memcg)
+{
+	if (!memcg)
+		memcg = root_mem_cgroup;
+	return &memcg->vmpressure;
+}
+
+struct cgroup_subsys_state *vmpressure_to_css(struct vmpressure *vmpr)
+{
+	return &container_of(vmpr, struct mem_cgroup, vmpressure)->css;
+}
+
+struct vmpressure *css_to_vmpressure(struct cgroup_subsys_state *css)
+{
+	return &mem_cgroup_from_css(css)->vmpressure;
+}
+
 static inline bool mem_cgroup_is_root(struct mem_cgroup *memcg)
 {
 	return (memcg == root_mem_cgroup);
@@ -6074,6 +6097,11 @@ static struct cftype mem_cgroup_files[] = {
 		.unregister_event = mem_cgroup_oom_unregister_event,
 		.private = MEMFILE_PRIVATE(_OOM_TYPE, OOM_CONTROL),
 	},
+	{
+		.name = "pressure_level",
+		.register_event = vmpressure_register_event,
+		.unregister_event = vmpressure_unregister_event,
+	},
 #ifdef CONFIG_NUMA
 	{
 		.name = "numa_stat",
@@ -6365,6 +6393,7 @@ mem_cgroup_css_alloc(struct cgroup *cont)
 	memcg->move_charge_at_immigrate = 0;
 	mutex_init(&memcg->thresholds_lock);
 	spin_lock_init(&memcg->move_lock);
+	vmpressure_init(&memcg->vmpressure);
 
 	return &memcg->css;
 
diff --git a/mm/vmpressure.c b/mm/vmpressure.c
new file mode 100644
index 0000000..ccbdc9e
--- /dev/null
+++ b/mm/vmpressure.c
@@ -0,0 +1,374 @@
+/*
+ * Linux VM pressure
+ *
+ * Copyright 2012 Linaro Ltd.
+ *		  Anton Vorontsov <anton.vorontsov(a)linaro.org>
+ *
+ * Based on ideas from Andrew Morton, David Rientjes, KOSAKI Motohiro,
+ * Leonid Moiseichuk, Mel Gorman, Minchan Kim and Pekka Enberg.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#include <linux/cgroup.h>
+#include <linux/fs.h>
+#include <linux/log2.h>
+#include <linux/sched.h>
+#include <linux/mm.h>
+#include <linux/vmstat.h>
+#include <linux/eventfd.h>
+#include <linux/swap.h>
+#include <linux/printk.h>
+#include <linux/vmpressure.h>
+
+/*
+ * The window size (vmpressure_win) is the number of scanned pages before
+ * we try to analyze scanned/reclaimed ratio. So the window is used as a
+ * rate-limit tunable for the "low" level notification, and also for
+ * averaging the ratio for medium/critical levels. Using small window
+ * sizes can cause lot of false positives, but too big window size will
+ * delay the notifications.
+ *
+ * As the vmscan reclaimer logic works with chunks which are multiple of
+ * SWAP_CLUSTER_MAX, it makes sense to use it for the window size as well.
+ *
+ * TODO: Make the window size depend on machine size, as we do for vmstat
+ * thresholds. Currently we set it to 512 pages (2MB for 4KB pages).
+ */
+static const unsigned long vmpressure_win = SWAP_CLUSTER_MAX * 16;
+
+/*
+ * These thresholds are used when we account memory pressure through
+ * scanned/reclaimed ratio. The current values were chosen empirically. In
+ * essence, they are percents: the higher the value, the more number
+ * unsuccessful reclaims there were.
+ */
+static const unsigned int vmpressure_level_med = 60;
+static const unsigned int vmpressure_level_critical = 95;
+
+/*
+ * When there are too little pages left to scan, vmpressure() may miss the
+ * critical pressure as number of pages will be less than "window size".
+ * However, in that case the vmscan priority will raise fast as the
+ * reclaimer will try to scan LRUs more deeply.
+ *
+ * The vmscan logic considers these special priorities:
+ *
+ * prio == DEF_PRIORITY (12): reclaimer starts with that value
+ * prio <= DEF_PRIORITY - 2 : kswapd becomes somewhat overwhelmed
+ * prio == 0                : close to OOM, kernel scans every page in an lru
+ *
+ * Any value in this range is acceptable for this tunable (i.e. from 12 to
+ * 0). Current value for the vmpressure_level_critical_prio is chosen
+ * empirically, but the number, in essence, means that we consider
+ * critical level when scanning depth is ~10% of the lru size (vmscan
+ * scans 'lru_size >> prio' pages, so it is actually 12.5%, or one
+ * eights).
+ */
+static const unsigned int vmpressure_level_critical_prio = ilog2(100 / 10);
+
+static struct vmpressure *work_to_vmpressure(struct work_struct *work)
+{
+	return container_of(work, struct vmpressure, work);
+}
+
+static struct vmpressure *cg_to_vmpressure(struct cgroup *cg)
+{
+	return css_to_vmpressure(cgroup_subsys_state(cg, mem_cgroup_subsys_id));
+}
+
+static struct vmpressure *vmpressure_parent(struct vmpressure *vmpr)
+{
+	struct cgroup *cg = vmpressure_to_css(vmpr)->cgroup;
+	struct mem_cgroup *memcg = mem_cgroup_from_cont(cg);
+
+	memcg = parent_mem_cgroup(memcg);
+	if (!memcg)
+		return NULL;
+	return memcg_to_vmpressure(memcg);
+}
+
+enum vmpressure_levels {
+	VMPRESSURE_LOW = 0,
+	VMPRESSURE_MEDIUM,
+	VMPRESSURE_CRITICAL,
+	VMPRESSURE_NUM_LEVELS,
+};
+
+static const char *vmpressure_str_levels[] = {
+	[VMPRESSURE_LOW] = "low",
+	[VMPRESSURE_MEDIUM] = "medium",
+	[VMPRESSURE_CRITICAL] = "critical",
+};
+
+static enum vmpressure_levels vmpressure_level(unsigned long pressure)
+{
+	if (pressure >= vmpressure_level_critical)
+		return VMPRESSURE_CRITICAL;
+	else if (pressure >= vmpressure_level_med)
+		return VMPRESSURE_MEDIUM;
+	return VMPRESSURE_LOW;
+}
+
+static enum vmpressure_levels vmpressure_calc_level(unsigned long scanned,
+						    unsigned long reclaimed)
+{
+	unsigned long scale = scanned + reclaimed;
+	unsigned long pressure;
+
+	/*
+	 * We calculate the ratio (in percents) of how many pages were
+	 * scanned vs. reclaimed in a given time frame (window). Note that
+	 * time is in VM reclaimer's "ticks", i.e. number of pages
+	 * scanned. This makes it possible to set desired reaction time
+	 * and serves as a ratelimit.
+	 */
+	pressure = scale - (reclaimed * scale / scanned);
+	pressure = pressure * 100 / scale;
+
+	pr_debug("%s: %3lu (s: %lu r: %lu)\n", __func__, pressure,
+		 scanned, reclaimed);
+
+	return vmpressure_level(pressure);
+}
+
+struct vmpressure_event {
+	struct eventfd_ctx *efd;
+	enum vmpressure_levels level;
+	struct list_head node;
+};
+
+static bool vmpressure_event(struct vmpressure *vmpr,
+			     unsigned long scanned, unsigned long reclaimed)
+{
+	struct vmpressure_event *ev;
+	enum vmpressure_levels level;
+	bool signalled = false;
+
+	level = vmpressure_calc_level(scanned, reclaimed);
+
+	mutex_lock(&vmpr->events_lock);
+
+	list_for_each_entry(ev, &vmpr->events, node) {
+		if (level >= ev->level) {
+			eventfd_signal(ev->efd, 1);
+			signalled = true;
+		}
+	}
+
+	mutex_unlock(&vmpr->events_lock);
+
+	return signalled;
+}
+
+static void vmpressure_work_fn(struct work_struct *work)
+{
+	struct vmpressure *vmpr = work_to_vmpressure(work);
+	unsigned long scanned;
+	unsigned long reclaimed;
+
+	/*
+	 * Several contexts might be calling vmpressure(), so it is
+	 * possible that the work was rescheduled again before the old
+	 * work context cleared the counters. In that case we will run
+	 * just after the old work returns, but then scanned might be zero
+	 * here. No need for any locks here since we don't care if
+	 * vmpr->reclaimed is in sync.
+	 */
+	if (!vmpr->scanned)
+		return;
+
+	mutex_lock(&vmpr->sr_lock);
+	scanned = vmpr->scanned;
+	reclaimed = vmpr->reclaimed;
+	vmpr->scanned = 0;
+	vmpr->reclaimed = 0;
+	mutex_unlock(&vmpr->sr_lock);
+
+	do {
+		if (vmpressure_event(vmpr, scanned, reclaimed))
+			break;
+		/*
+		 * If not handled, propagate the event upward into the
+		 * hierarchy.
+		 */
+	} while ((vmpr = vmpressure_parent(vmpr)));
+}
+
+/**
+ * vmpressure() - Account memory pressure through scanned/reclaimed ratio
+ * @gfp:	reclaimer's gfp mask
+ * @memcg:	cgroup memory controller handle
+ * @scanned:	number of pages scanned
+ * @reclaimed:	number of pages reclaimed
+ *
+ * This function should be called from the vmscan reclaim path to account
+ * "instantaneous" memory pressure (scanned/reclaimed ratio). The raw
+ * pressure index is then further refined and averaged over time.
+ *
+ * This function does not return any value.
+ */
+void vmpressure(gfp_t gfp, struct mem_cgroup *memcg,
+		unsigned long scanned, unsigned long reclaimed)
+{
+	struct vmpressure *vmpr = memcg_to_vmpressure(memcg);
+
+	/*
+	 * Here we only want to account pressure that userland is able to
+	 * help us with. For example, suppose that DMA zone is under
+	 * pressure; if we notify userland about that kind of pressure,
+	 * then it will be mostly a waste as it will trigger unnecessary
+	 * freeing of memory by userland (since userland is more likely to
+	 * have HIGHMEM/MOVABLE pages instead of the DMA fallback). That
+	 * is why we include only movable, highmem and FS/IO pages.
+	 * Indirect reclaim (kswapd) sets sc->gfp_mask to GFP_KERNEL, so
+	 * we account it too.
+	 */
+	if (!(gfp & (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_IO | __GFP_FS)))
+		return;
+
+	/*
+	 * If we got here with no pages scanned, then that is an indicator
+	 * that reclaimer was unable to find any shrinkable LRUs at the
+	 * current scanning depth. But it does not mean that we should
+	 * report the critical pressure, yet. If the scanning priority
+	 * (scanning depth) goes too high (deep), we will be notified
+	 * through vmpressure_prio(). But so far, keep calm.
+	 */
+	if (!scanned)
+		return;
+
+	mutex_lock(&vmpr->sr_lock);
+	vmpr->scanned += scanned;
+	vmpr->reclaimed += reclaimed;
+	scanned = vmpr->scanned;
+	mutex_unlock(&vmpr->sr_lock);
+
+	if (scanned < vmpressure_win || work_pending(&vmpr->work))
+		return;
+	schedule_work(&vmpr->work);
+}
+
+/**
+ * vmpressure_prio() - Account memory pressure through reclaimer priority level
+ * @gfp:	reclaimer's gfp mask
+ * @memcg:	cgroup memory controller handle
+ * @prio:	reclaimer's priority
+ *
+ * This function should be called from the reclaim path every time when
+ * the vmscan's reclaiming priority (scanning depth) changes.
+ *
+ * This function does not return any value.
+ */
+void vmpressure_prio(gfp_t gfp, struct mem_cgroup *memcg, int prio)
+{
+	/*
+	 * We only use prio for accounting critical level. For more info
+	 * see comment for vmpressure_level_critical_prio variable above.
+	 */
+	if (prio > vmpressure_level_critical_prio)
+		return;
+
+	/*
+	 * OK, the prio is below the threshold, updating vmpressure
+	 * information before shrinker dives into long shrinking of long
+	 * range vmscan. Passing scanned = vmpressure_win, reclaimed = 0
+	 * to the vmpressure() basically means that we signal 'critical'
+	 * level.
+	 */
+	vmpressure(gfp, memcg, vmpressure_win, 0);
+}
+
+/**
+ * vmpressure_register_event() - Bind vmpressure notifications to an eventfd
+ * @cg:		cgroup that is interested in vmpressure notifications
+ * @cft:	cgroup control files handle
+ * @eventfd:	eventfd context to link notifications with
+ * @args:	event arguments (used to set up a pressure level threshold)
+ *
+ * This function associates eventfd context with the vmpressure
+ * infrastructure, so that the notifications will be delivered to the
+ * @eventfd. The @args parameter is a string that denotes pressure level
+ * threshold (one of vmpressure_str_levels, i.e. "low", "medium", or
+ * "critical").
+ *
+ * This function should not be used directly, just pass it to (struct
+ * cftype).register_event, and then cgroup core will handle everything by
+ * itself.
+ */
+int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
+			      struct eventfd_ctx *eventfd, const char *args)
+{
+	struct vmpressure *vmpr = cg_to_vmpressure(cg);
+	struct vmpressure_event *ev;
+	int level;
+
+	for (level = 0; level < VMPRESSURE_NUM_LEVELS; level++) {
+		if (!strcmp(vmpressure_str_levels[level], args))
+			break;
+	}
+
+	if (level >= VMPRESSURE_NUM_LEVELS)
+		return -EINVAL;
+
+	ev = kzalloc(sizeof(*ev), GFP_KERNEL);
+	if (!ev)
+		return -ENOMEM;
+
+	ev->efd = eventfd;
+	ev->level = level;
+
+	mutex_lock(&vmpr->events_lock);
+	list_add(&ev->node, &vmpr->events);
+	mutex_unlock(&vmpr->events_lock);
+
+	return 0;
+}
+
+/**
+ * vmpressure_unregister_event() - Unbind eventfd from vmpressure
+ * @cg:		cgroup handle
+ * @cft:	cgroup control files handle
+ * @eventfd:	eventfd context that was used to link vmpressure with the @cg
+ *
+ * This function does internal manipulations to detach the @eventfd from
+ * the vmpressure notifications, and then frees internal resources
+ * associated with the @eventfd (but the @eventfd itself is not freed).
+ *
+ * This function should not be used directly, just pass it to (struct
+ * cftype).unregister_event, and then cgroup core will handle everything
+ * by itself.
+ */
+void vmpressure_unregister_event(struct cgroup *cg, struct cftype *cft,
+				 struct eventfd_ctx *eventfd)
+{
+	struct vmpressure *vmpr = cg_to_vmpressure(cg);
+	struct vmpressure_event *ev;
+
+	mutex_lock(&vmpr->events_lock);
+	list_for_each_entry(ev, &vmpr->events, node) {
+		if (ev->efd != eventfd)
+			continue;
+		list_del(&ev->node);
+		kfree(ev);
+		break;
+	}
+	mutex_unlock(&vmpr->events_lock);
+}
+
+/**
+ * vmpressure_init() - Initialize vmpressure control structure
+ * @vmpr:	Structure to be initialized
+ *
+ * This function should be called on every allocated vmpressure structure
+ * before any usage.
+ */
+void vmpressure_init(struct vmpressure *vmpr)
+{
+	mutex_init(&vmpr->sr_lock);
+	mutex_init(&vmpr->events_lock);
+	INIT_LIST_HEAD(&vmpr->events);
+	INIT_WORK(&vmpr->work, vmpressure_work_fn);
+}
diff --git a/mm/vmscan.c b/mm/vmscan.c
index df78d17..616e2bb 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -19,6 +19,7 @@
 #include <linux/pagemap.h>
 #include <linux/init.h>
 #include <linux/highmem.h>
+#include <linux/vmpressure.h>
 #include <linux/vmstat.h>
 #include <linux/file.h>
 #include <linux/writeback.h>
@@ -1982,6 +1983,11 @@ static void shrink_zone(struct zone *zone, struct scan_control *sc)
 			}
 			memcg = mem_cgroup_iter(root, memcg, &reclaim);
 		} while (memcg);
+
+		vmpressure(sc->gfp_mask, sc->target_mem_cgroup,
+			   sc->nr_scanned - nr_scanned,
+			   sc->nr_reclaimed - nr_reclaimed);
+
 	} while (should_continue_reclaim(zone, sc->nr_reclaimed - nr_reclaimed,
 					 sc->nr_scanned - nr_scanned, sc));
 }
@@ -2167,6 +2173,8 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 		count_vm_event(ALLOCSTALL);
 
 	do {
+		vmpressure_prio(sc->gfp_mask, sc->target_mem_cgroup,
+				sc->priority);
 		sc->nr_scanned = 0;
 		aborted_reclaim = shrink_zones(zonelist, sc);
 
-- 
1.8.1.4

12 years, 2 months


[PATCH 1/5] timer: move enum definition out of ifdef section

by Daniel Lezcano

The next patch will automatically set up the broadcast timer for the different cpuidle drivers when one idle state stops its timer. This will be part of the generic code.

But some ARM boards, like s3c64xx, use cpuidle without CONFIG_GENERIC_CLOCKEVENTS_BUILD set. Hence the cpuidle framework will be compiled with the code supposed to be generic, that is with clockevents_notify and the different enum.

Also the function clockevents_notify is a noop macro, this is fine except the usual code is:

	int cpu = smp_processor_id();
	clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ON, &cpu);

and that raises a warning for the variable cpu which is not used.

Move the clock_event_nofitiers enum definition out of the CONFIG_GENERIC_CLOCKEVENTS_BUILD section to prevent a compilation error when these are used in the code.

Change the clockevents_notify macro to a static inline noop function to prevent a compilation warning.

Signed-off-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
---
 include/linux/clockchips.h |   32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h
index 6634652..f9fd937 100644
--- a/include/linux/clockchips.h
+++ b/include/linux/clockchips.h
@@ -8,6 +8,20 @@
 #ifndef _LINUX_CLOCKCHIPS_H
 #define _LINUX_CLOCKCHIPS_H
 
+/* Clock event notification values */
+enum clock_event_nofitiers {
+	CLOCK_EVT_NOTIFY_ADD,
+	CLOCK_EVT_NOTIFY_BROADCAST_ON,
+	CLOCK_EVT_NOTIFY_BROADCAST_OFF,
+	CLOCK_EVT_NOTIFY_BROADCAST_FORCE,
+	CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
+	CLOCK_EVT_NOTIFY_BROADCAST_EXIT,
+	CLOCK_EVT_NOTIFY_SUSPEND,
+	CLOCK_EVT_NOTIFY_RESUME,
+	CLOCK_EVT_NOTIFY_CPU_DYING,
+	CLOCK_EVT_NOTIFY_CPU_DEAD,
+};
+
 #ifdef CONFIG_GENERIC_CLOCKEVENTS_BUILD
 
 #include <linux/clocksource.h>
@@ -26,20 +40,6 @@ enum clock_event_mode {
 	CLOCK_EVT_MODE_RESUME,
 };
 
-/* Clock event notification values */
-enum clock_event_nofitiers {
-	CLOCK_EVT_NOTIFY_ADD,
-	CLOCK_EVT_NOTIFY_BROADCAST_ON,
-	CLOCK_EVT_NOTIFY_BROADCAST_OFF,
-	CLOCK_EVT_NOTIFY_BROADCAST_FORCE,
-	CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
-	CLOCK_EVT_NOTIFY_BROADCAST_EXIT,
-	CLOCK_EVT_NOTIFY_SUSPEND,
-	CLOCK_EVT_NOTIFY_RESUME,
-	CLOCK_EVT_NOTIFY_CPU_DYING,
-	CLOCK_EVT_NOTIFY_CPU_DEAD,
-};
-
 /*
  * Clock event features
  */
@@ -173,7 +173,7 @@ extern int tick_receive_broadcast(void);
 #ifdef CONFIG_GENERIC_CLOCKEVENTS
 extern void clockevents_notify(unsigned long reason, void *arg);
 #else
-# define clockevents_notify(reason, arg) do { } while (0)
+static inline void clockevents_notify(unsigned long reason, void *arg) {}
 #endif
 
 #else /* CONFIG_GENERIC_CLOCKEVENTS_BUILD */
@@ -181,7 +181,7 @@ extern void clockevents_notify(unsigned long reason, void *arg);
 static inline void clockevents_suspend(void) {}
 static inline void clockevents_resume(void) {}
 
-#define clockevents_notify(reason, arg) do { } while (0)
+static inline void clockevents_notify(unsigned long reason, void *arg) {}
 
 #endif
 
-- 
1.7.9.5
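
The warning described in this patch can be reproduced outside the kernel. The sketch below is standalone C with made-up names (notify_stub, DEMO_BROADCAST_ON, broadcast_on_demo are not kernel symbols); it only illustrates why a static inline no-op keeps the local variable referenced where an empty macro would not.

```c
/* Stand-in for one value of enum clock_event_nofitiers; the name is illustrative. */
enum demo_notify { DEMO_BROADCAST_ON = 1 };

/*
 * An empty macro such as
 *     #define notify_stub(reason, arg) do { } while (0)
 * discards its arguments during preprocessing, so the compiler never sees
 * a use of the caller's variable and may warn that it is unused.
 *
 * A static inline no-op is a real function call: the arguments are still
 * type-checked and count as used, and the optimizer can drop the empty body.
 */
static inline void notify_stub(unsigned long reason, void *arg)
{
	(void)reason;
	(void)arg;
}

/* Mirrors the calling pattern quoted in the commit message. */
int broadcast_on_demo(int cpu_id)
{
	int cpu = cpu_id;	/* referenced below, so no unused-variable warning */

	notify_stub(DEMO_BROADCAST_ON, &cpu);
	return cpu;
}
```

A side benefit of the inline form is that a caller passing the wrong argument types is caught at compile time even in configurations where the call does nothing.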

12 years, 2 months


[PATCH] tools: cpufreq: Fix cpufreq-info print messages for affected[related]_cpus

by Viresh Kumar

The earlier definitions of affected and related cpus were:

Related_cpus: CPUs which run at the same hardware frequency.
Affected_cpus: CPUs which need to have their frequency coordinated by software.

These definitions were very confusing as they don't communicate the real difference between them. Following are the new definitions of these variables:

Related_cpus: All (Online & Offline) CPUs that run at the same hardware frequency.
Affected_cpus: Online CPUs that run at the same hardware frequency.

The definitions above are more consistent with the latest cpufreq core code.

Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
 tools/power/cpupower/utils/cpufreq-info.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/power/cpupower/utils/cpufreq-info.c b/tools/power/cpupower/utils/cpufreq-info.c
index 28953c9..a81d4ec 100644
--- a/tools/power/cpupower/utils/cpufreq-info.c
+++ b/tools/power/cpupower/utils/cpufreq-info.c
@@ -247,7 +247,7 @@ static void debug_output_one(unsigned int cpu)
 
 	cpus = cpufreq_get_related_cpus(cpu);
 	if (cpus) {
-		printf(_(" CPUs which run at the same hardware frequency: "));
+		printf(_(" All (Online & Offline) CPUs that run at the same hardware frequency: "));
 		while (cpus->next) {
 			printf("%d ", cpus->cpu);
 			cpus = cpus->next;
@@ -258,7 +258,7 @@ static void debug_output_one(unsigned int cpu)
 
 	cpus = cpufreq_get_affected_cpus(cpu);
 	if (cpus) {
-		printf(_(" CPUs which need to have their frequency coordinated by software: "));
+		printf(_(" Online CPUs that run at the same hardware frequency: "));
 		while (cpus->next) {
 			printf("%d ", cpus->cpu);
 			cpus = cpus->next;
-- 
1.7.12.rc2.18.g61b472e
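
One way to read the new definitions: the CPUs listed in related_cpus but missing from affected_cpus are the offline ones that share the clock. The sketch below makes that concrete. The helper names (cpu_list_to_mask, offline_siblings) are hypothetical, the input strings are made-up examples of a space-separated CPU list, and CPU numbers are assumed to be below 64.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a space-separated CPU list such as "0 1 2 3" into a bitmask.
 * Illustrative only: ids of 64 and above are silently ignored.
 */
uint64_t cpu_list_to_mask(const char *list)
{
	uint64_t mask = 0;
	char *end;

	for (;;) {
		unsigned long cpu = strtoul(list, &end, 10);

		if (end == list)	/* nothing left to parse */
			break;
		if (cpu < 64)
			mask |= UINT64_C(1) << cpu;
		list = end;
	}
	return mask;
}

/* CPUs that share the frequency domain but are currently offline. */
uint64_t offline_siblings(const char *related, const char *affected)
{
	return cpu_list_to_mask(related) & ~cpu_list_to_mask(affected);
}
```

For example, with related "0 1 2 3" and affected "0 1", the result has bits 2 and 3 set, meaning CPUs 2 and 3 are offline members of the same frequency domain.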

12 years, 2 months


[PATCH v4] clk: allow reentrant calls into the clk framework

by Mike Turquette

Reentrancy into the clock framework from the clk.h api is necessary for clocks that are prepared and unprepared via i2c_transfer (which includes many PMICs and discrete audio chips) as well as for several other use cases.

This patch implements reentrancy by adding two global atomic_t's which track the context of the current caller. Context in this case is the return value from get_current(). One context variable is for slow operations protected by the prepare_lock mutex and the other is for fast operations protected by the enable_lock spinlock.

The clk.h api implementations are modified to first see if the relevant global lock is already held and if so compare the global context (set by whoever is holding the lock) against their own context (via a call to get_current()). If the two match then this function is a nested call from the one already holding the lock and we proceed. If the context does not match then we proceed to take the lock (mutex_lock, or spin_lock_irqsave on the enable path) and wait for the existing holder to complete.

This patch does not increase concurrency for unrelated calls into the clock framework. Instead it simply allows reentrancy by the single task which is currently holding the global clock framework lock.

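
The owner-tracking idea described above can be sketched in portable C11, outside the kernel. Everything here is an assumption made for illustration: fwk_lock, fwk_owner, self_tag, fwk_acquire(), fwk_release() and demo_enable() are invented names, an atomic_flag spinlock stands in for enable_lock, and the address of a thread-local variable stands in for get_current(). Unlike the patch, the sketch stores the context in a pointer-sized atomic rather than casting it to int, and it checks only the owner instead of first testing whether the lock is held.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static atomic_flag fwk_lock = ATOMIC_FLAG_INIT;	/* stands in for enable_lock */
static _Atomic(void *) fwk_owner;		/* NULL while nobody holds the lock */
static _Thread_local char self_tag;		/* its address identifies this thread */

/* True when the calling thread is the one that already holds fwk_lock. */
bool fwk_is_reentrant(void)
{
	return atomic_load(&fwk_owner) == (void *)&self_tag;
}

void fwk_acquire(void)
{
	while (atomic_flag_test_and_set(&fwk_lock))
		;	/* spin until the current holder releases */
	/* set context for any reentrant calls */
	atomic_store(&fwk_owner, (void *)&self_tag);
}

void fwk_release(void)
{
	/* clear the context before dropping the lock */
	atomic_store(&fwk_owner, NULL);
	atomic_flag_clear(&fwk_lock);
}

/*
 * Toy "clk operation" that calls back into itself `nested` more times,
 * the way a clk callback might call clk_enable() again. Only the
 * outermost call takes and releases the lock. Returns the depth reached.
 */
int demo_enable(int nested)
{
	bool reentered = fwk_is_reentrant();
	int depth = 1;

	if (!reentered)
		fwk_acquire();
	if (nested > 0)
		depth += demo_enable(nested - 1);
	if (!reentered)
		fwk_release();
	return depth;
}
```

Calling demo_enable(3) from one thread nests four levels deep without deadlocking on the non-recursive lock, while a second thread calling it at the same time would spin in fwk_acquire() until the first finishes. That matches the commit message's point that reentrancy is allowed for the holder only and concurrency is not increased.
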
Signed-off-by: Mike Turquette <mturquette(a)linaro.org>
Cc: Rajagopal Venkat <rajagopal.venkat(a)linaro.org>
Cc: David Brown <davidb(a)codeaurora.org>
Cc: Ulf Hansson <ulf.hansson(a)linaro.org>
Cc: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com>
---
 drivers/clk/clk.c | 255 ++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 186 insertions(+), 69 deletions(-)

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index 5e8ffff..17432a5 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -19,9 +19,12 @@
 #include <linux/of.h>
 #include <linux/device.h>
 #include <linux/init.h>
+#include <linux/sched.h>
 
 static DEFINE_SPINLOCK(enable_lock);
 static DEFINE_MUTEX(prepare_lock);
+static atomic_t prepare_context;
+static atomic_t enable_context;
 
 static HLIST_HEAD(clk_root_list);
 static HLIST_HEAD(clk_orphan_list);
@@ -456,27 +459,6 @@ unsigned int __clk_get_prepare_count(struct clk *clk)
 	return !clk ? 0 : clk->prepare_count;
 }
 
-unsigned long __clk_get_rate(struct clk *clk)
-{
-	unsigned long ret;
-
-	if (!clk) {
-		ret = 0;
-		goto out;
-	}
-
-	ret = clk->rate;
-
-	if (clk->flags & CLK_IS_ROOT)
-		goto out;
-
-	if (!clk->parent)
-		ret = 0;
-
-out:
-	return ret;
-}
-
 unsigned long __clk_get_flags(struct clk *clk)
 {
 	return !clk ? 0 : clk->flags;
@@ -566,6 +548,35 @@ struct clk *__clk_lookup(const char *name)
 	return NULL;
 }
 
+/***  locking & reentrancy  ***/
+
+static void clk_fwk_lock(void)
+{
+	/* hold the framework-wide lock, context == NULL */
+	mutex_lock(&prepare_lock);
+
+	/* set context for any reentrant calls */
+	atomic_set(&prepare_context, (int) get_current());
+}
+
+static void clk_fwk_unlock(void)
+{
+	/* clear the context */
+	atomic_set(&prepare_context, 0);
+
+	/* release the framework-wide lock, context == NULL */
+	mutex_unlock(&prepare_lock);
+}
+
+static bool clk_is_reentrant(void)
+{
+	if (mutex_is_locked(&prepare_lock))
+		if ((void *) atomic_read(&prepare_context) == get_current())
+			return true;
+
+	return false;
+}
+
 /***  clk api  ***/
 
 void __clk_unprepare(struct clk *clk)
@@ -600,9 +611,15 @@ void __clk_unprepare(struct clk *clk)
  */
 void clk_unprepare(struct clk *clk)
 {
-	mutex_lock(&prepare_lock);
+	/* re-enter if call is from the same context */
+	if (clk_is_reentrant()) {
+		__clk_unprepare(clk);
+		return;
+	}
+
+	clk_fwk_lock();
 	__clk_unprepare(clk);
-	mutex_unlock(&prepare_lock);
+	clk_fwk_unlock();
 }
 EXPORT_SYMBOL_GPL(clk_unprepare);
 
@@ -648,10 +665,16 @@ int clk_prepare(struct clk *clk)
 {
 	int ret;
 
-	mutex_lock(&prepare_lock);
-	ret = __clk_prepare(clk);
-	mutex_unlock(&prepare_lock);
+	/* re-enter if call is from the same context */
+	if (clk_is_reentrant()) {
+		ret = __clk_prepare(clk);
+		goto out;
+	}
 
+	clk_fwk_lock();
+	ret = __clk_prepare(clk);
+	clk_fwk_unlock();
+out:
 	return ret;
 }
 EXPORT_SYMBOL_GPL(clk_prepare);
@@ -692,8 +715,27 @@ void clk_disable(struct clk *clk)
 {
 	unsigned long flags;
 
+	/* this call re-enters if it is from the same context */
+	if (spin_is_locked(&enable_lock)) {
+		if ((void *) atomic_read(&enable_context) == get_current()) {
+			__clk_disable(clk);
+			return;
+		}
+	}
+
+	/* hold the framework-wide lock, context == NULL */
 	spin_lock_irqsave(&enable_lock, flags);
+
+	/* set context for any reentrant calls */
+	atomic_set(&enable_context, (int) get_current());
+
+	/* disable the clock(s) */
 	__clk_disable(clk);
+
+	/* clear the context */
+	atomic_set(&enable_context, 0);
+
+	/* release the framework-wide lock, context == NULL */
 	spin_unlock_irqrestore(&enable_lock, flags);
 }
 EXPORT_SYMBOL_GPL(clk_disable);
@@ -745,10 +787,29 @@ int clk_enable(struct clk *clk)
 	unsigned long flags;
 	int ret;
 
+	/* this call re-enters if it is from the same context */
+	if (spin_is_locked(&enable_lock)) {
+		if ((void *) atomic_read(&enable_context) == get_current()) {
+			ret = __clk_enable(clk);
+			goto out;
+		}
+	}
+
+	/* hold the framework-wide lock, context == NULL */
 	spin_lock_irqsave(&enable_lock, flags);
+
+	/* set context for any reentrant calls */
+	atomic_set(&enable_context, (int) get_current());
+
+	/* enable the clock(s) */
 	ret = __clk_enable(clk);
-	spin_unlock_irqrestore(&enable_lock, flags);
 
+	/* clear the context */
+	atomic_set(&enable_context, 0);
+
+	/* release the framework-wide lock, context == NULL */
+	spin_unlock_irqrestore(&enable_lock, flags);
+out:
 	return ret;
 }
 EXPORT_SYMBOL_GPL(clk_enable);
@@ -792,10 +853,17 @@ long clk_round_rate(struct clk *clk, unsigned long rate)
 {
 	unsigned long ret;
 
-	mutex_lock(&prepare_lock);
+	/* this call re-enters if it is from the same context */
+	if (clk_is_reentrant()) {
+		ret = __clk_round_rate(clk, rate);
+		goto out;
+	}
+
+	clk_fwk_lock();
 	ret = __clk_round_rate(clk, rate);
-	mutex_unlock(&prepare_lock);
+	clk_fwk_unlock();
 
+out:
 	return ret;
 }
 EXPORT_SYMBOL_GPL(clk_round_rate);
@@ -877,6 +945,30 @@ static void __clk_recalc_rates(struct clk *clk, unsigned long msg)
 		__clk_recalc_rates(child, msg);
 }
 
+unsigned long __clk_get_rate(struct clk *clk)
+{
+	unsigned long ret;
+
+	if (!clk) {
+		ret = 0;
+		goto out;
+	}
+
+	if (clk->flags & CLK_GET_RATE_NOCACHE)
+		__clk_recalc_rates(clk, 0);
+
+	ret = clk->rate;
+
+	if (clk->flags & CLK_IS_ROOT)
+		goto out;
+
+	if (!clk->parent)
+		ret = 0;
+
+out:
+	return ret;
+}
+
 /**
  * clk_get_rate - return the rate of clk
  * @clk: the
clk whose rate is being returned @@ -889,14 +981,22 @@ unsigned long clk_get_rate(struct clk *clk) { unsigned long rate; - mutex_lock(&prepare_lock); + /* + * FIXME - any locking here seems heavy weight + * can clk->rate be replaced with an atomic_t? + * same logic can likely be applied to prepare_count & enable_count + */ - if (clk && (clk->flags & CLK_GET_RATE_NOCACHE)) - __clk_recalc_rates(clk, 0); + if (clk_is_reentrant()) { + rate = __clk_get_rate(clk); + goto out; + } + clk_fwk_lock(); rate = __clk_get_rate(clk); - mutex_unlock(&prepare_lock); + clk_fwk_unlock(); +out: return rate; } EXPORT_SYMBOL_GPL(clk_get_rate); @@ -1073,6 +1173,39 @@ static void clk_change_rate(struct clk *clk) clk_change_rate(child); } +int __clk_set_rate(struct clk *clk, unsigned long rate) +{ + int ret = 0; + struct clk *top, *fail_clk; + + /* bail early if nothing to do */ + if (rate == clk->rate) + return 0; + + if ((clk->flags & CLK_SET_RATE_GATE) && clk->prepare_count) { + return -EBUSY; + } + + /* calculate new rates and get the topmost changed clock */ + top = clk_calc_new_rates(clk, rate); + if (!top) + return -EINVAL; + + /* notify that we are about to change rates */ + fail_clk = clk_propagate_rate_change(top, PRE_RATE_CHANGE); + if (fail_clk) { + pr_warn("%s: failed to set %s rate\n", __func__, + fail_clk->name); + clk_propagate_rate_change(top, ABORT_RATE_CHANGE); + return -EBUSY; + } + + /* change the rates */ + clk_change_rate(top); + + return ret; +} + /** * clk_set_rate - specify a new rate for clk * @clk: the clk whose rate is being changed @@ -1096,44 +1229,18 @@ static void clk_change_rate(struct clk *clk) */ int clk_set_rate(struct clk *clk, unsigned long rate) { - struct clk *top, *fail_clk; int ret = 0; - /* prevent racing with updates to the clock topology */ - mutex_lock(&prepare_lock); - - /* bail early if nothing to do */ - if (rate == clk->rate) - goto out; - - if ((clk->flags & CLK_SET_RATE_GATE) && clk->prepare_count) { - ret = -EBUSY; - goto out; - } - - 
/* calculate new rates and get the topmost changed clock */ - top = clk_calc_new_rates(clk, rate); - if (!top) { - ret = -EINVAL; - goto out; - } - - /* notify that we are about to change rates */ - fail_clk = clk_propagate_rate_change(top, PRE_RATE_CHANGE); - if (fail_clk) { - pr_warn("%s: failed to set %s rate\n", __func__, - fail_clk->name); - clk_propagate_rate_change(top, ABORT_RATE_CHANGE); - ret = -EBUSY; + if (clk_is_reentrant()) { + ret = __clk_set_rate(clk, rate); goto out; } - /* change the rates */ - clk_change_rate(top); + clk_fwk_lock(); + ret = __clk_set_rate(clk, rate); + clk_fwk_unlock(); out: - mutex_unlock(&prepare_lock); - return ret; } EXPORT_SYMBOL_GPL(clk_set_rate); @@ -1148,10 +1255,16 @@ struct clk *clk_get_parent(struct clk *clk) { struct clk *parent; - mutex_lock(&prepare_lock); + if (clk_is_reentrant()) { + parent = __clk_get_parent(clk); + goto out; + } + + clk_fwk_lock(); parent = __clk_get_parent(clk); - mutex_unlock(&prepare_lock); + clk_fwk_unlock(); +out: return parent; } EXPORT_SYMBOL_GPL(clk_get_parent); @@ -1330,6 +1443,7 @@ out: int clk_set_parent(struct clk *clk, struct clk *parent) { int ret = 0; + bool reenter; if (!clk || !clk->ops) return -EINVAL; @@ -1337,8 +1451,10 @@ int clk_set_parent(struct clk *clk, struct clk *parent) if (!clk->ops->set_parent) return -ENOSYS; - /* prevent racing with updates to the clock topology */ - mutex_lock(&prepare_lock); + reenter = clk_is_reentrant(); + + if (!reenter) + clk_fwk_lock(); if (clk->parent == parent) goto out; @@ -1367,7 +1483,8 @@ int clk_set_parent(struct clk *clk, struct clk *parent) __clk_reparent(clk, parent); out: - mutex_unlock(&prepare_lock); + if (!reenter) + clk_fwk_unlock(); return ret; } -- 1.7.10.4
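The reentrancy scheme in the patch above (take the framework-wide lock, record who holds it, and let nested calls from the same context skip the lock) can be sketched in userspace. This is only an illustration under stated assumptions: it uses POSIX threads instead of kernel primitives, and every name in it (fwk_lock, fwk_unlock, fwk_is_reentrant, fwk_call, self_token) is hypothetical. The owner is kept in a pointer-sized atomic here; the (int) cast of get_current() used with atomic_t in the patch would lose bits on a 64-bit kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Framework-wide lock: a plain, non-recursive mutex (stands in for prepare_lock). */
static pthread_mutex_t fwk_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Identity of the thread that holds fwk_mutex, or NULL while it is free. */
static _Atomic(void *) fwk_owner = NULL;

/* The address of this per-thread variable plays the role of get_current(). */
static _Thread_local char self_token;

/* Current nesting level; only touched while fwk_mutex is held. */
static int nesting;

void fwk_lock(void)
{
    pthread_mutex_lock(&fwk_mutex);
    /* record the owner so nested calls can recognise themselves */
    atomic_store(&fwk_owner, (void *)&self_token);
}

void fwk_unlock(void)
{
    /* clear the owner before the lock becomes available again */
    atomic_store(&fwk_owner, NULL);
    pthread_mutex_unlock(&fwk_mutex);
}

bool fwk_is_reentrant(void)
{
    /* only the owning thread can observe its own token here */
    return atomic_load(&fwk_owner) == (void *)&self_token;
}

/* Hypothetical API entry point that may call back into itself. */
int fwk_call(int levels)
{
    bool reenter = fwk_is_reentrant();
    int deepest;

    if (!reenter)
        fwk_lock();

    nesting++;
    deepest = (levels > 1) ? fwk_call(levels - 1) : nesting;
    nesting--;

    if (!reenter)
        fwk_unlock();

    return deepest;
}
```

Calling fwk_call(3) from one thread nests three levels deep while taking the mutex only once; without the owner check the second level would block forever on the non-recursive mutex.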

[PATCH v10 0/3] Add DRM FIMD DT support for Exynos4 DT Machines

by Vikas Sajjan

This patch series adds support for DRM FIMD DT for Exynos4 DT Machines, specifically for the Exynos4412 SoC.

changes since v9:
- dropped the patch "ARM: dts: Add lcd pinctrl node entries for EXYNOS4412 SoC", as the gpios in the newly added nodes "lcd_en" and "lcd_sync" in this patch were already pulled high by the existing "lcd_clk" node.
- addressed comments from Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com> and Thomas Abraham <thomas.abraham(a)linaro.org>

changes since v8:
- addressed comments to add missing documentation for the clock and clock-names properties, as pointed out by Sachin Kamat <sachin.kamat(a)linaro.org>

changes since v7:
- rebased to kgene's "for-next"
- migrated to the Common Clock Framework
- removed the patch "ARM: dts: Add FIMD AUXDATA node entry for exynos4 DT", as migration to the Common Clock Framework does not need it.
- addressed the comments raised by Sachin Kamat <sachin.kamat(a)linaro.org>

changes since v6:
- addressed comments and added interrupt-names = "fifo", "vsync", "lcd_sys" in exynos4.dtsi, and re-ordered the interrupt numbering to match the order in the interrupt combiner IP, as suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>.

changes since v5:
- renamed the fimd binding documentation file to "samsung-fimd.txt", since it covers not only the exynos display controller but also previous samsung display controllers.
- rephrased an ambiguous statement about the interrupt combiner in the fimd binding documentation, as pointed out by Sachin Kamat <sachin.kamat(a)linaro.org>

changes since v4:
- moved the fimd binding documentation to Documentation/devicetree/bindings/video/ as suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>
- added more fimd compatibility strings in the fimd documentation, as discussed at https://patchwork.kernel.org/patch/2144861/ with Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com> and Tomasz Figa <tomasz.figa(a)gmail.com>
- modified the compatible string for exynos4 fimd to "exynos4210-fimd" and for exynos5 fimd to "exynos5250-fimd", to stick to the rule that a compatible value should be named after the first specific SoC model in which this particular IP version was included, as discussed at https://patchwork.kernel.org/patch/2144861/
- documented more about the interrupt combiner and the interrupt order, as suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>

changes since v3:
- rebased on http://git.kernel.org/?p=linux/kernel/git/kgene/linux-samsung.git;a=shortlo…

changes since v2:
- added alias to 'fimd@11c00000' node (reported by: Rahul Sharma <r.sh.open(a)gmail.com>)
- removed 'lcd0_data' node as there was already a similar node lcd_data24 (reported by: Jingoo Han <jg1.han(a)samsung.com>)
- replaced spaces with tabs in display-timing node

changes since v1:
- added new patch to add FIMD DT binding Documentation
- removed patch enabling SAMSUNG_DEV_BACKLIGHT and SAMSUNG_DEV_PMW for mach-exynos4 DT
- added 'status' property to fimd DT node

Based on kgene's "for-next" branch:
https://git.kernel.org/cgit/linux/kernel/git/kgene/linux-samsung.git/log/?h…

Vikas Sajjan (3):
  ARM: dts: Add FIMD node to exynos4
  ARM: dts: Add FIMD node and display timing node to exynos4412-origen.dts
  ARM: dts: Add FIMD DT binding Documentation

 .../devicetree/bindings/video/samsung-fimd.txt | 65 ++++++++++++++++++++
 arch/arm/boot/dts/exynos4.dtsi                 | 12 ++++
 arch/arm/boot/dts/exynos4412-origen.dts        | 21 +++++++
 3 files changed, 98 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/video/samsung-fimd.txt

--
1.7.9.5

[PATCH v7 0/5] Add ST-Ericsson AB8500 HWMON driver

by Hongbo Zhang

Guenter and Anton,

Only one minor update, as Guenter mentioned for v6. Thank you.

v6 -> v7 changes:
- move exporting symbols from [5/5] to [4/5]; having them in [5/5] was a mistake.

v5 -> v6 changes:
- add depend on AB8500_BM in Kconfig
- fix wrong usage of clamp_val()
- export symbols for module compiling

v4 -> v5 changes:
- split the old [2/3]-ab8500-re-arrange-ab8500-power-and-temperature-data into three new patches: [2/5], [3/5] and [4/5].
- hwmon driver minor coding style clean ups:
  - {} usage in if-else statement in ab8500_read_sensor function
  - index error fix in gpadc_monitor function
  - fix issue of clamp_val() usage
  - remove unnecessary else in function abx500_attrs_visible
  - remove redundant print message about irq set up
  - return the calling function return value directly in probe function

v3 -> v4 changes:
for patch [3/3]
- define delays in HZ
- update ab8500_read_sensor function, returning temp by parameter
- remove ab8500_is_visible function
- use clamp_val in set_min and set_max callback
- remove unnecessary locks in remove and suspend functions
- let abx500 and ab8500 use their own data structures
for patch [2/3]
- move the data tables from driver/power/ab8500_bmdata.c to include/linux/power/ab8500.h
- rename driver/power/ab8500_bmdata.c to driver/power/ab8500_bm.c
- rename these variable names to eliminate CamelCase warnings
- add const attribute to these data

v2 -> v3 changes:
- Add interface for converting voltage to temperature
- Remove temp5 sensor since we cannot offer a temperature read interface for it
- Update hyst to use absolute temperature instead of a difference
- Add the 3/3 patch

v1 -> v2 changes:
- Add Documentation/hwmon/ab8500 and Documentation/hwmon/abx500
- Make devices which cannot report milli-Celsius invisible
- Add temp5_crit interface
- Re-work the old find_active_thresholds() to threshold_updated()
- Reset updated_min_alarm and updated_max_alarm at the end of each loop
- Update the hyst mechanism to make it work as real hyst
- Remove non-standard attributes
- Re-order the sequence of operations inside probe and remove functions
- Update all the lock usages to eliminate race conditions
- Make attribute indexes start from 0

Other changes:
- The old [1/2] "ARM: ux500: rename ab8500 to abx500 for hwmon driver" has been merged by Samuel, so it is not sent again.
- Add another new patch "ab8500_btemp: export two symbols" as [2/2] of this patch set.

Hongbo Zhang (5):
  ab8500_btemp: make ab8500_btemp_get* interfaces public
  ab8500: power: eliminate CamelCase warning of some variables
  ab8500: power: add const attributes to some data arrays
  ab8500: power: export abx500_res_to_temp tables for hwmon
  hwmon: add ST-Ericsson ABX500 hwmon driver

 Documentation/hwmon/ab8500           |  22 ++
 Documentation/hwmon/abx500           |  28 ++
 drivers/hwmon/Kconfig                |  13 +
 drivers/hwmon/Makefile               |   1 +
 drivers/hwmon/ab8500.c               | 206 +++++++++++++++
 drivers/hwmon/abx500.c               | 491 +++++++++++++++++++++++++++++++++++
 drivers/hwmon/abx500.h               |  69 +++++
 drivers/power/ab8500_bmdata.c        |  42 +--
 drivers/power/ab8500_btemp.c         |   5 +-
 drivers/power/ab8500_fg.c            |   4 +-
 include/linux/mfd/abx500.h           |   6 +-
 include/linux/mfd/abx500/ab8500-bm.h |   5 +
 include/linux/power/ab8500.h         |  16 ++
 13 files changed, 885 insertions(+), 23 deletions(-)
 create mode 100644 Documentation/hwmon/ab8500
 create mode 100644 Documentation/hwmon/abx500
 create mode 100644 drivers/hwmon/ab8500.c
 create mode 100644 drivers/hwmon/abx500.c
 create mode 100644 drivers/hwmon/abx500.h
 create mode 100644 include/linux/power/ab8500.h

--
1.8.0

[ACTIVITY] 2013-03-23 - 2013-03-29

by David Long

=== David Long ===

=== Highlights ===
* I returned to and fixed the uprobe xol issue in my upleveled version of Rabin's patches.
* Thanks to Rabin for offering to look at the above problem, although I was able to fix it without his help.
* I've updated my Ubuntu PC and I have been using that for comparisons of uprobes behavior.

=== Plans ===
* Switch back to working on the code restructuring.

=== Issues ===

=== Travel/Time Off ===
* I unexpectedly had to take vacation on Monday and Tuesday for personal business.

-dl

[PATCH] cpufreq: cpufreq-cpu0: Call CPUFREQ_POSTCHANGE notifier for failure cases too

by Viresh Kumar

Currently we are simply returning from target() if we encounter some error after broadcasting CPUFREQ_PRECHANGE notifier. Which looks to be wrong as others might depend on POSTCHANGE notifier for their functioning.

So, better broadcast CPUFREQ_POSTCHANGE notifier for these failure cases too, but with old frequency.

Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
 drivers/cpufreq/cpufreq-cpu0.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/cpufreq/cpufreq-cpu0.c b/drivers/cpufreq/cpufreq-cpu0.c
index 1cab820..6561853 100644
--- a/drivers/cpufreq/cpufreq-cpu0.c
+++ b/drivers/cpufreq/cpufreq-cpu0.c
@@ -74,7 +74,9 @@ static int cpu0_set_target(struct cpufreq_policy *policy,
 	if (IS_ERR(opp)) {
 		rcu_read_unlock();
 		pr_err("failed to find OPP for %ld\n", freq_Hz);
-		return PTR_ERR(opp);
+		freqs.new = freqs.old;
+		ret = PTR_ERR(opp);
+		goto post_notify;
 	}
 	volt = opp_get_voltage(opp);
 	rcu_read_unlock();
@@ -92,7 +94,7 @@ static int cpu0_set_target(struct cpufreq_policy *policy,
 		if (ret) {
 			pr_err("failed to scale voltage up: %d\n", ret);
 			freqs.new = freqs.old;
-			return ret;
+			goto post_notify;
 		}
 	}

@@ -101,7 +103,8 @@ static int cpu0_set_target(struct cpufreq_policy *policy,
 		pr_err("failed to set clock rate: %d\n", ret);
 		if (cpu_reg)
 			regulator_set_voltage_tol(cpu_reg, volt_old, tol);
-		return ret;
+		freqs.new = freqs.old;
+		goto post_notify;
 	}

 	/* scaling down? scale voltage after frequency */
@@ -111,13 +114,13 @@ static int cpu0_set_target(struct cpufreq_policy *policy,
 			pr_err("failed to scale voltage down: %d\n", ret);
 			clk_set_rate(cpu_clk, freqs.old * 1000);
 			freqs.new = freqs.old;
-			return ret;
 		}
 	}

+post_notify:
 	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);

-	return 0;
+	return ret;
 }

 static int cpu0_cpufreq_init(struct cpufreq_policy *policy)
--
1.7.12.rc2.18.g61b472e
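The control flow this patch moves to (every failure after the PRECHANGE broadcast falls through to a single POSTCHANGE broadcast that carries the old frequency) can be shown with a small userspace mock. This is a sketch only: notify_transition(), mock_hw_step(), set_target() and the counters are hypothetical stand-ins, not the cpufreq API, and the error value -5 is arbitrary.

```c
#include <assert.h>

enum transition_state { PRECHANGE, POSTCHANGE };

struct freq_change {
    unsigned int old_khz;
    unsigned int new_khz;
};

/* Bookkeeping so callers can observe what was broadcast. */
int pre_notifications;
int post_notifications;
unsigned int last_post_khz;

static void notify_transition(const struct freq_change *fc,
                              enum transition_state state)
{
    if (state == PRECHANGE) {
        pre_notifications++;
    } else {
        post_notifications++;
        last_post_khz = fc->new_khz;
    }
}

/* Hypothetical hardware step: returns 0, or a negative code when told to fail. */
static int mock_hw_step(int should_fail)
{
    return should_fail ? -5 : 0;
}

int set_target(unsigned int old_khz, unsigned int new_khz,
               int fail_voltage, int fail_clock)
{
    struct freq_change fc = { old_khz, new_khz };
    int ret;

    notify_transition(&fc, PRECHANGE);

    ret = mock_hw_step(fail_voltage);
    if (ret) {
        fc.new_khz = fc.old_khz;    /* report "nothing changed" */
        goto post_notify;
    }

    ret = mock_hw_step(fail_clock);
    if (ret)
        fc.new_khz = fc.old_khz;

post_notify:
    /* always pair the earlier PRECHANGE with a POSTCHANGE */
    notify_transition(&fc, POSTCHANGE);
    return ret;
}
```

With this shape a listener sees exactly one POSTCHANGE per PRECHANGE, and on failure the reported new frequency equals the old one, while the caller still gets the error code back.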

[PATCH] cpufreq: cpufreq-cpu0: No need to check cpu number in init()

by Viresh Kumar

It is not possible for init() to be called for any cpu other than cpu0. During bootup, whatever cpu is used to boot the system will be assigned as cpu0. And later on policy->cpu can only change if we hot-unplug all cpus first and then hotplug them back in a different order, which isn't possible (the system requires at least one cpu to be up always :)).

Though I can see one situation where policy->cpu can be different than zero:
- Hot-unplug cpu 0.
- rmmod cpufreq-cpu0 module
- insmod it back
- hotplug cpu 0 again.

Here, policy->cpu would be different. But the driver doesn't have any dependency on cpu0 as such. We don't mind which cpu of a system is policy->cpu and so this check is just not required. Remove it.

Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
 drivers/cpufreq/cpufreq-cpu0.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq-cpu0.c b/drivers/cpufreq/cpufreq-cpu0.c
index 0f16267..1cab820 100644
--- a/drivers/cpufreq/cpufreq-cpu0.c
+++ b/drivers/cpufreq/cpufreq-cpu0.c
@@ -124,9 +124,6 @@ static int cpu0_cpufreq_init(struct cpufreq_policy *policy)
 {
 	int ret;

-	if (policy->cpu != 0)
-		return -EINVAL;
-
 	ret = cpufreq_frequency_table_cpuinfo(policy, freq_table);
 	if (ret) {
 		pr_err("invalid frequency table: %d\n", ret);
--
1.7.12.rc2.18.g61b472e

[PATCH V4 0/4] Queue work on UNBOUND wq

by Viresh Kumar

This patchset was called "Create sched_select_cpu() and use it for workqueues" for the first three versions.

Earlier discussions over v3, v2 and v1 can be found here:
https://lkml.org/lkml/2013/3/18/364
http://lists.linaro.org/pipermail/linaro-dev/2012-November/014344.html
http://www.mail-archive.com/linaro-dev@lists.linaro.org/msg13342.html

For power saving it is better to schedule work on cpus that aren't idle, as bringing a cpu/cluster out of idle state can be very costly (both performance and power wise). Earlier we tried to use the timer infrastructure to take this decision, but we found out later that the scheduler gives even better results, and so we should use the scheduler for choosing the cpu for scheduling work.

In the workqueue subsystem, workqueues with the flag WQ_UNBOUND are the ones which use the scheduler to select the target cpu.

Here we are migrating a few users of workqueues to WQ_UNBOUND. These drivers are found to be very active on an idle or lightly busy system, and using WQ_UNBOUND for them gave impressive results.

Setup:
-----
- ARM Vexpress TC2 - big.LITTLE CPU
- Core 0-1: A15, 2-4: A7
- rootfs: linaro-ubuntu-devel

This patchset has been tested on a big.LITTLE system (heterogeneous) but is useful for all other homogeneous systems as well. During these tests audio was played in the background using aplay.

Results:
-------

          Cluster A15 Energy    Cluster A7 Energy    Total
          ------------------    -----------------    ---------

Without this patchset (Energy in Joules):
---------------------------------------------------
          0.151162              2.183545             2.334707
          0.223730              2.687067             2.910797
          0.289687              2.732702             3.022389
          0.454198              2.745908             3.200106
          0.495552              2.746465             3.242017
Average:  0.322866              2.619137             2.942003

With this patchset (Energy in Joules):
-----------------------------------------------
          0.226421              2.283658             2.510079
          0.151361              2.236656             2.388017
          0.197726              2.249849             2.447575
          0.221915              2.229446             2.451361
          0.347098              2.257707             2.604805
Average:  0.2289042             2.2514632            2.4803674

The above tests were repeated multiple times, and events were tracked using trace-cmd and analysed using kernelshark. It was easily noticeable that idle time for many cpus increased considerably, which eventually saved some power.

PS: All the earlier Acks we got for drivers are reverted here, as the patches have been updated significantly.

V3->V4:
-------
- Dropped changes to kernel/sched directory and hence sched_select_non_idle_cpu().
- Dropped queue_work_on_any_cpu()
- Created system_freezable_unbound_wq
- Changed all patches accordingly.

V2->V3:
-------
- Dropped changes into core queue_work() API, rather create *_on_any_cpu() APIs
- Dropped running timers migration patch as that was broken
- Migrated few users of workqueues to use *_on_any_cpu() APIs.

Viresh Kumar (4):
  workqueue: Add system wide system_freezable_unbound_wq
  PHYLIB: queue work on unbound wq
  block: queue work on unbound wq
  fbcon: queue work on unbound wq

 block/blk-core.c              |  3 ++-
 block/blk-ioc.c               |  2 +-
 block/genhd.c                 | 10 ++++++----
 drivers/net/phy/phy.c         |  9 +++++----
 drivers/video/console/fbcon.c |  2 +-
 include/linux/workqueue.h     |  4 ++++
 kernel/workqueue.c            |  7 ++++++-
 7 files changed, 25 insertions(+), 12 deletions(-)

--
1.7.12.rc2.18.g61b472e
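The reported averages and the overall saving can be rechecked from the per-run totals in the results above. The sketch below only does that arithmetic; the helper names (mean, percent_saved, v4_saving_percent) are made up for this illustration and the numbers are copied from the two "Total" columns.

```c
#include <assert.h>
#include <stddef.h>

/* "Total" column, in Joules, copied from the results quoted above. */
static const double total_without[] = {
    2.334707, 2.910797, 3.022389, 3.200106, 3.242017
};
static const double total_with[] = {
    2.510079, 2.388017, 2.447575, 2.451361, 2.604805
};

/* Arithmetic mean of n samples. */
double mean(const double *samples, size_t n)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++)
        sum += samples[i];
    return sum / (double)n;
}

/* Relative reduction from before to after, in percent. */
double percent_saved(double before, double after)
{
    return 100.0 * (before - after) / before;
}

double mean_without(void)
{
    return mean(total_without, sizeof(total_without) / sizeof(total_without[0]));
}

double mean_with(void)
{
    return mean(total_with, sizeof(total_with) / sizeof(total_with[0]));
}

/* Saving of the averaged total energy with the patchset applied. */
double v4_saving_percent(void)
{
    return percent_saved(mean_without(), mean_with());
}
```

The means reproduce the "Average" rows (about 2.942 J without and 2.480 J with the patchset), which works out to a reduction of roughly 15.7 percent in total energy over these five runs.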

[PATCH V4 0/2] Implement per policy instance of governor

by Viresh Kumar

Hi Guys,

All patches are pushed here for others to apply (you can apply from mail too):
http://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/h…

Currently, there can't be multiple instances of a single governor_type. If we have a multi-package system, where we have multiple instances of struct policy (per package), we can't have multiple instances of the same governor, i.e. we can't have multiple instances of the ondemand governor for multiple packages.

The governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/governor-name/, which again reflects that there can be only one instance of a governor_type in the system.

This is a bottleneck for multicluster systems, where we want different packages to use the same governor type, but with different tunables.

This patchset is inclined towards fixing this issue. Now we will create the governors directory in cpu/cpu*/cpufreq/<gov> for platforms which have multiple struct policy alive at any moment. Platform drivers requiring this feature must set the have_governor_per_policy variable in their instance of cpufreq_driver. For others the interface is kept the same: cpu/cpufreq/<gov>.

This is V4 of this patchset. V3 is already applied by Rafael in his linux-next branch. Jacob Shin reported some regressions with this patchset, and when I went into testing it with his configuration I found more issues than what he reported.

To test these over linux-next you need to revert the following first:

db9baec cpufreq: Get rid of "struct global_attr"
86bd6f0 cpufreq: governor: Implement per policy instances of governors
8ae67b1 cpufreq: Add per policy governor-init/exit infrastructure

I have tested this for the following now and believe there are no more regressions with it:
- platform with a single policy instance or single group of cpu
- platform with multiple policies but which don't want per policy instance of governor
- platform with multiple policies and which want per policy instance of governor

I have tried with different settings and combinations of governors.

@Rafael: To simplify your life I have sorted out your branch and you can simply pick up the complete branch that I have pushed.

V3->V4:
- We have two instances of all show/store routines for ondemand/conservative governor. One for per-policy instance of governor and other for one governor instance for all policies.
- Dropped: db9baec cpufreq: Get rid of "struct global_attr".
- Fixed cpufreq_governor_dbs for multiple policies using same governor instance.
- Implemented a few macros in cpufreq_governor.h to make the above stuff clean.
- Renamed have_multiple_policies to have_governor_per_policy
- Some more minor cleanups

Viresh Kumar (2):
  cpufreq: Add per policy governor-init/exit infrastructure
  cpufreq: governor: Implement per policy instances of governors

 drivers/cpufreq/cpufreq.c              |  36 ++++-
 drivers/cpufreq/cpufreq_conservative.c | 193 ++++++++++++++----------
 drivers/cpufreq/cpufreq_governor.c     | 212 +++++++++++++++++---------
 drivers/cpufreq/cpufreq_governor.h     | 117 +++++++++++++--
 drivers/cpufreq/cpufreq_ondemand.c     | 263 ++++++++++++++++++++-------------
 include/linux/cpufreq.h                |  17 ++-
 6 files changed, 562 insertions(+), 276 deletions(-)

--
1.7.12.rc2.18.g61b472e
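The two sysfs layouts described in this cover letter (one global governor directory, or one per policy when the driver sets have_governor_per_policy) can be illustrated with a small path-building helper. This is a userspace sketch for illustration only: governor_dir() is a hypothetical function and not part of cpufreq; the path prefixes are the ones quoted in the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the sysfs directory where a governor's tunables would live.
 * per_policy mirrors the driver's have_governor_per_policy setting;
 * cpu is the policy's cpu and is ignored for the global layout.
 * The result lives in a static buffer, so it is overwritten by the
 * next call (fine for a sketch, not thread-safe).
 */
const char *governor_dir(bool per_policy, unsigned int cpu, const char *gov)
{
    static char path[128];

    if (per_policy)
        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpu%u/cpufreq/%s", cpu, gov);
    else
        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpufreq/%s", gov);
    return path;
}
```

With the global layout every policy resolves to the same directory, which is why only one set of tunables can exist; the per-policy layout yields a distinct directory per policy cpu.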

[PATCH] cpufreq: Documentation: Fix cpufreq_frequency_table name

by Viresh Kumar

In a few places in the documentation, cpufreq_frequency_table is written as cpufreq_freq_table. Fix these.

Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
 Documentation/cpu-freq/cpu-drivers.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/cpu-freq/cpu-drivers.txt b/Documentation/cpu-freq/cpu-drivers.txt
index c94383f..a3585ea 100644
--- a/Documentation/cpu-freq/cpu-drivers.txt
+++ b/Documentation/cpu-freq/cpu-drivers.txt
@@ -185,10 +185,10 @@ the reference implementation in drivers/cpufreq/longrun.c
 As most cpufreq processors only allow for being set to a few specific
 frequencies, a "frequency table" with some functions might assist in
 some work of the processor driver. Such a "frequency table" consists
-of an array of struct cpufreq_freq_table entries, with any value in
+of an array of struct cpufreq_frequency_table entries, with any value in
 "index" you want to use, and the corresponding frequency in
 "frequency". At the end of the table, you need to add a
-cpufreq_freq_table entry with frequency set to CPUFREQ_TABLE_END. And
+cpufreq_frequency_table entry with frequency set to CPUFREQ_TABLE_END. And
 if you want to skip one entry in the table, set the frequency to
 CPUFREQ_ENTRY_INVALID. The entries don't need to be in ascending
 order.
--
1.7.12.rc2.18.g61b472e
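The table layout described in the patched documentation (an end marker, entries that can be skipped, no ordering requirement) might look like the following in a userspace mock. This is a sketch under assumptions: struct mock_freq_entry, MOCK_ENTRY_INVALID and MOCK_TABLE_END are hypothetical stand-ins for the kernel's struct cpufreq_frequency_table, CPUFREQ_ENTRY_INVALID and CPUFREQ_TABLE_END, and the sentinel values chosen here are placeholders.

```c
#include <assert.h>
#include <limits.h>

/* Placeholder sentinels standing in for the kernel's markers. */
#define MOCK_ENTRY_INVALID  (UINT_MAX)
#define MOCK_TABLE_END      (UINT_MAX - 1)

struct mock_freq_entry {
    unsigned int index;      /* any value the driver wants to use */
    unsigned int frequency;  /* kHz, or one of the sentinels above */
};

/* Entries need not be sorted; one entry is deliberately skipped. */
const struct mock_freq_entry sample_table[] = {
    { 0, 800000 },
    { 1, MOCK_ENTRY_INVALID },
    { 2, 1000000 },
    { 3, 600000 },
    { 4, MOCK_TABLE_END },
};

/* Highest usable frequency in the table, or 0 if there is none. */
unsigned int table_max_khz(const struct mock_freq_entry *table)
{
    unsigned int max = 0;

    for (; table->frequency != MOCK_TABLE_END; table++) {
        if (table->frequency == MOCK_ENTRY_INVALID)
            continue;   /* skipped entry */
        if (table->frequency > max)
            max = table->frequency;
    }
    return max;
}

/* Number of usable entries before the end marker. */
unsigned int table_valid_entries(const struct mock_freq_entry *table)
{
    unsigned int count = 0;

    for (; table->frequency != MOCK_TABLE_END; table++)
        if (table->frequency != MOCK_ENTRY_INVALID)
            count++;
    return count;
}
```

Walking until the end marker rather than by array length is what lets a driver hand over a bare pointer to the table.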

[PATCH V3 0/7] Create sched_select_cpu() and use it for workqueues

by Viresh Kumar

In order to save power, it would be useful to schedule light weight work on cpus that aren't IDLE instead of waking up an IDLE one.

By idle cpu (from scheduler's perspective) we mean:
- Current task is idle task
- nr_running == 0
- wake_list is empty

This is already implemented for timers as get_nohz_timer_target(). We can figure out a few more users of this feature, like workqueues.

This patchset converts get_nohz_timer_target() into a generic API sched_select_cpu() so that other frameworks (like workqueue) can also use it.

This routine returns the cpu which is non-idle. It accepts a bitwise OR of SD_* flags present in linux/sched.h. If the local CPU isn't idle OR all cpus are idle, the local cpu is returned. If the local cpu is idle, then we must look for another CPU which has all the flags passed as argument set and isn't idle.

The first two patches of this patchset create the generic API sched_select_cpu(). In the third patch we create a new set of APIs for workqueues to queue work on any cpu. All other patches migrate some of the users of workqueues which showed up significantly on my setup. Others can be migrated later.

Earlier discussions over v1 and v2 can be found here:
http://www.mail-archive.com/linaro-dev@lists.linaro.org/msg13342.html
http://lists.linaro.org/pipermail/linaro-dev/2012-November/014344.html

Earlier discussions over this concept were done at last LPC:
http://summit.linuxplumbersconf.org/lpc-2012/meeting/90/lpc2012-sched-timer…

Setup:
-----
- ARM Vexpress TC2 - big.LITTLE CPU
- Core 0-1: A15, 2-4: A7
- rootfs: linaro-ubuntu-devel

This patchset has been tested on a big.LITTLE system (heterogeneous) but is useful for all other homogeneous systems as well. During these tests audio was played in the background using aplay.

Results:
-------

          Cluster A15 Energy    Cluster A7 Energy    Total
          ------------------    -----------------    --------

Without this patchset (Energy in Joules):
---------------------
          0.151162              2.183545             2.334707
          0.223730              2.687067             2.910797
          0.289687              2.732702             3.022389
          0.454198              2.745908             3.200106
          0.495552              2.746465             3.242017
Average:  0.322866              2.619137             2.942003

With this patchset (Energy in Joules):
---------------------
          0.133361              2.267822             2.401183
          0.260626              2.833389             3.094015
          0.142365              2.277929             2.420294
          0.246793              2.582550             2.829343
          0.130462              2.257033             2.387495
Average:  0.182721              2.443745             2.626466

The above tests were repeated multiple times, and events were tracked using trace-cmd and analysed using kernelshark. It was easily noticeable that idle time for many cpus increased considerably, which eventually saved some power.

These patches are applied here for others to test:
http://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/h…

Viresh Kumar (7):
  sched: Create sched_select_cpu() to give preferred CPU for power saving
  timer: hrtimer: Don't check idle_cpu() before calling get_nohz_timer_target()
  workqueue: Add helpers to schedule work on any cpu
  PHYLIB: queue work on any cpu
  mmc: queue work on any cpu
  block: queue work on any cpu
  fbcon: queue work on any cpu

 block/blk-core.c              |   6 +-
 block/blk-ioc.c               |   2 +-
 block/genhd.c                 |   9 ++-
 drivers/mmc/core/core.c       |   2 +-
 drivers/net/phy/phy.c         |   9 +--
 drivers/video/console/fbcon.c |   2 +-
 include/linux/sched.h         |  21 +++++-
 include/linux/workqueue.h     |   5 ++
 kernel/hrtimer.c              |   2 +-
 kernel/sched/core.c           |  63 +++++++++-------
 kernel/timer.c                |   2 +-
 kernel/workqueue.c            | 163 +++++++++++++++++++++++++++++-------------
 12 files changed, 192 insertions(+), 94 deletions(-)

--
1.7.12.rc2.18.g61b472e
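The selection rule stated in this cover letter (keep the local cpu if it is busy or if everything is idle, otherwise pick a busy cpu that has all requested flags) can be modelled in a few lines. This is a simplified userspace sketch, not the proposed sched_select_cpu(): it scans a flat array instead of walking scheduling domains, and struct mock_cpu, select_cpu() and the MOCK_SD_* flag values are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for SD_* flags; the values are arbitrary. */
#define MOCK_SD_A 0x1u
#define MOCK_SD_B 0x2u

struct mock_cpu {
    bool idle;              /* idle from the scheduler's point of view */
    unsigned int sd_flags;  /* flags available on this cpu's domain */
};

/* cpu0 and cpu1 idle, cpu2 and cpu3 busy. */
const struct mock_cpu sample_cpus[4] = {
    { true,  MOCK_SD_A | MOCK_SD_B },
    { true,  MOCK_SD_A },
    { false, MOCK_SD_A },
    { false, MOCK_SD_A | MOCK_SD_B },
};

/* Every cpu idle: the local cpu has to be used. */
const struct mock_cpu all_idle_cpus[2] = {
    { true, MOCK_SD_A },
    { true, MOCK_SD_A },
};

int select_cpu(const struct mock_cpu *cpus, int ncpus, int local,
               unsigned int required)
{
    /* a busy local cpu is always the preferred target */
    if (!cpus[local].idle)
        return local;

    /* local cpu is idle: look for a busy cpu carrying all required flags */
    for (int i = 0; i < ncpus; i++) {
        if (i == local)
            continue;
        if (!cpus[i].idle && (cpus[i].sd_flags & required) == required)
            return i;
    }

    /* nothing suitable (for example, all cpus idle): fall back to local */
    return local;
}
```

For the sample array, work queued from idle cpu0 with MOCK_SD_A goes to cpu2, the same request with both flags goes to cpu3, and a request from busy cpu2 stays on cpu2.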

[ACTIVITY] (John Stultz) March 25-29

by John Stultz

=== Highlights ===
* Sent out ntp locking change patches to lkml, didn't get any objections
* Opened bug to track alarmdev unit test failures
* Reviewed blueprints and had biweekly android upstreaming subteam hangout.
* Provided DmitryP with instructions for submitting kernel changes to AOSP (he created a wiki with them - http://wiki.linaro.org/Process/PushingBitsToAndroid )
* Sent in expense reports for Linaro Connect.
* Had some discussions with Jesse Barker and Serban about ION (both upstreaming and build issues for non-arm).
* Had some discussions on the list about the future of drivers/clocksource maintenance
* Worked with Appala (who was working very late nights) on some issues on the binder testing.
* Sent Dmitry's sync compat_ioctl fixes to lkml/gregkh. Are queued for 3.10
* Sent tglx git pull request with my timekeeping changes for 3.10
* Sent Minchan my current work on making his volatile range patches more generic

=== Plans ===
* Focus on volatile range work in prep for lsf-mm
* Still need to work on earlysuspend blog post
* If tglx agrees, push timekeeping lock hold reductions to him.

=== Issues ===
* Caught a cold, so I've been a bit slow and foggy this week.

Re: [Bug 55411] sysfs per-cpu cpufreq subdirs/symlinks screwed up after s2ram

by Viresh Kumar

Hi Guys,

We are talking here about a bug reported by Duncan. His cpu/cpu*/cpufreq directories are getting corrupted with 3.9-rc3; they were working well with 3.8:

https://bugzilla.kernel.org/show_bug.cgi?id=55411

On his AMD bulldozer tri-cluster/6-core system he doesn't see affected and related cpus set correctly after off-lining cpus 1-5 and bringing them back with:

for i in 1 2 3 4 5; do echo 0 > /sys/devices/system/cpu/cpu$i/online ; done
for i in 1 2 3 4 5; do echo 1 > /sys/devices/system/cpu/cpu$i/online ; done

Before running the above two, cpufreq-info gave:
https://bugzilla.kernel.org/attachment.cgi?id=95701

And after running them it gave:
https://bugzilla.kernel.org/attachment.cgi?id=95711

Clearly it got corrupted. Somehow cpu 3 showed up in the related cpus field of cpu 5.

I suspect the following patches are behind this:

commit fcf8058296edbc3de43adf095824fc32b067b9f8
Author: Viresh Kumar <viresh.kumar(a)linaro.org>
Date: Tue Jan 29 14:39:08 2013 +0000

    cpufreq: Simplify cpufreq_add_dev()

    Currently cpufreq_add_dev() firsts allocates policy, calls driver->init() and then checks if this CPU is already managed or not. And if it is already managed, its policy is freed.

    We can save all this if we somehow know that CPU is managed or not in advance. policy->related_cpus contains the list of all valid sibling CPUs of policy->cpu. We can check this to see if the current CPU is already managed.

    From now on, platforms don't really need to set related_cpus from their init() routines, as the same work is done by core too.

    If a platform driver needs to set the related_cpus mask with some additional CPUs, other than CPUs present in policy->cpus, they are free to do it, though, as we don't override anything.

    [rjw: Changelog]
    Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
    Tested-by: Shawn Guo <shawn.guo(a)linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>

AND

commit 643ae6e81dd65b333a13259852405fc9f764ac76
Author: Viresh Kumar <viresh.kumar(a)linaro.org>
Date: Sat Jan 12 05:14:38 2013 +0000

    cpufreq: Manage only online cpus

    cpufreq core doesn't manage offline cpus and if driver->init() has returned mask including offline cpus, it may result in unwanted behavior by cpufreq core or governors.

    We need to get only online cpus in this mask. There are two places to fix this mask, cpufreq core and cpufreq driver. It makes sense to do this at common place and hence is done in core.

    Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>

And this is the latest piece of documentation available:

    SMP systems normally have same clock source for a group of cpus. For these the .init() would be called only once for the first online cpu. Here the .init() routine must initialize policy->cpus with mask of all possible cpus (Online + Offline) that share the clock. Then the core would copy this mask onto policy->related_cpus and will reset policy->cpus to carry only online cpus.

I looked at the acpi-cpufreq driver's driver->init() code and found it is not yet aligned with this theory; that is probably what causes these failures. I don't have enough knowledge about this driver and how it is used across x86 systems, so I want somebody else (who has some prior experience with it) to check how policy->cpus and policy->related_cpus must be set from driver->init().
--
viresh

---------- Forwarded message ----------
From: <bugzilla-daemon(a)bugzilla.kernel.org>
Date: 19 March 2013 13:19
Subject: [Bug 55411] sysfs per-cpu cpufreq subdirs/symlinks screwed up after s2ram
To: viresh.kumar(a)linaro.org

https://bugzilla.kernel.org/show_bug.cgi?id=55411

--- Comment #9 from Duncan <1i5t5.duncan(a)cox.net> 2013-03-19 07:49:53 ---
(In reply to comment #8)
> (In reply to comment #0)
>> After a s2ram/resume cycle (now bad):
>>
>> /sys/devices/system/cpu/cpu0/cpufreq/
>> /sys/devices/system/cpu/cpu1/cpufreq -> ../cpu0/cpufreq/
>> /sys/devices/system/cpu/cpu3/cpufreq/
>> /sys/devices/system/cpu/cpu5/cpufreq/
>
> Can you try this rather than s2r:
>
> for i in 1 2 3 4 5; do echo 0 > /sys/devices/system/cpu/cpu$i/online ; done
> for i in 1 2 3 4 5; do echo 1 > /sys/devices/system/cpu/cpu$i/online ; done
>
> and check the status if things are still corrupted for you?
> Above doesn't corrupt anything for me Atleast.

That's a nice easy test; no rebuild and reboot needed. =:^) Tho I had to change the > to >| as I have bash noclobber set and the files obviously already exist...

Uncorrupted before the test, corrupted after. So just cycling the cpus off and then back online *DOES* corrupt cpufreq, thus a much simpler reproducer! =:^)

Exact same ls results as the above.

> And my system doesn't have S2R support for now.

My old system didn't support s2ram reliably; it would work occasionally but mostly it didn't. But s2disk was workable for awhile, until the fact that I was running mdraid and the disks didn't always return in the same sdX slots due to hardware wakeup issues complicated things, so eventually I didn't use that much either.

The new system's great with s2ram, sans this bug of course; s2disk didn't work at all at first, but last time I tried it /almost/ worked so there has been improvement. But I don't like to take unnecessary chances with filesystem log replay and thankfully wall power's good enough around here that I can s2ram for a day and come back and wiggle the mouse and all's fine (with a couple pre-suspend syncs thrown into my script just in case), so I tend to use it a LOT, even more than I'd use s2disk due to the speed. =:^)

But I'd love to have s2both working reliably; for all I know it's actually working now; it was pretty close. But I prefer not to test the reiserfs log replay (even with pre-suspend syncs I worry, tho as I said reiserfs has actually been very good to me even thru faulty ram, a power supply blowing up on me, a mobo dying, etc, since 2.6.16 or whenever it was that it got ordered journaling by default) when it doesn't work, so knowing s2disk didn't work well when I tested it and with s2ram working SO well, I don't tend to test s2disk/s2both too often.

Meanwhile, thanks for the cpuinfo_cur_freq explanation. If that actually real-time touches the hardware to get the data as you say, that does explain the root privs. Maybe that bit of extra info could be added to the documentation? I could propose some new wording and open a new bug on cpu-freq/user-guide.txt for it if appropriate.

--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
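The mask handling described in the documentation quoted in this thread can be modelled in user space. This is only a sketch under stated assumptions: `cpumask_model_t`, `struct cpufreq_policy_model` and `core_fixup_masks()` are hypothetical names, a plain bitmask stands in for the kernel's cpumask, and nothing here is taken from acpi-cpufreq or the cpufreq core.

```c
#include <stdint.h>

/* User-space model only: bit i set means CPU i is a member of the mask. */
typedef uint32_t cpumask_model_t;

/* Hypothetical stand-in for the two masks discussed in the thread. */
struct cpufreq_policy_model {
	cpumask_model_t cpus;         /* after fixup: online CPUs sharing the clock */
	cpumask_model_t related_cpus; /* online + offline CPUs sharing the clock */
};

/*
 * Models the documented behaviour: ->init() has filled p->cpus with every
 * possible CPU that shares the clock; the core copies that mask into
 * related_cpus and then keeps only the online CPUs in cpus.
 */
static void core_fixup_masks(struct cpufreq_policy_model *p,
			     cpumask_model_t online)
{
	p->related_cpus = p->cpus;
	p->cpus &= online;
}
```

For example, if a driver reports CPUs 4 and 5 as sharing a clock (mask 0x30) while CPU 5 is offline, the model leaves related_cpus at 0x30 and reduces cpus to 0x10. A driver that reports only some of the siblings would break the first half of that contract.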

12 years, 2 months

5
36
0 0

[PATCH v3] ARM: add support for context tracking subsystem

by Kevin Hilman

commit 91d1aa43 (context_tracking: New context tracking susbsystem) generalized parts of the RCU userspace extended quiescent state into the context tracking subsystem. Context tracking is then used to implement adaptive tickless (a.k.a. extended nohz).

To support the new context tracking subsystem on ARM, the user/kernel boundary transitions need to be instrumented.

For exceptions and IRQs in usermode, the existing usr_entry macro is used to instrument the user->kernel transition. For the return to usermode path, the ret_to_user* path is instrumented. Using the usr_entry macro, this covers interrupts in userspace, data abort and prefetch abort exceptions in userspace, as well as undefined exceptions in userspace (which is where FP emulation and VFP are handled).

For syscalls, the slow return path is covered by instrumenting the ret_to_user path. In addition, the syscall entry point is instrumented, which covers the user->kernel transition for both fast and slow syscalls, and an additional instrumentation point is added for the fast syscall return path (ret_fast_syscall).
Cc: Mats Liljegren <mats.liljegren(a)enea.com> Cc: Frederic Weisbecker <fweisbec(a)gmail.com> Signed-off-by: Kevin Hilman <khilman(a)linaro.org> --- Updates from v2: - optionally save/restore registers before calling user_enter/user_exit (suggested by Russell King) Updates from v1: - instrument entry/exit points directly in assembly, instead of C code - combined exceptions and syscalls into a single patch - covers VFP and FP emulation now (v1 limitation pointed out by Russell) Depends on the previously posted prerequistes series: [PATCH 0/3] ARM: context tracking support prerequisites http://marc.info/?l=linux-kernel&m=136382248131438&w=2 Both of which are combined on top of Frederic's 3.9-rc1-nohz1 branch and available here: git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux.git arm-nohz-v3/context-tracking arch/arm/Kconfig | 1 + arch/arm/include/asm/thread_info.h | 1 + arch/arm/kernel/entry-armv.S | 1 + arch/arm/kernel/entry-common.S | 3 +++ arch/arm/kernel/entry-header.S | 28 ++++++++++++++++++++++++++++ 5 files changed, 34 insertions(+) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index ba8bf89..0b13689 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -59,6 +59,7 @@ config ARM select OLD_SIGSUSPEND3 select OLD_SIGACTION select HAVE_VIRT_CPU_ACCOUNTING + select HAVE_CONTEXT_TRACKING help The ARM series is a line of low-power-consumption RISC chip designs licensed by ARM Ltd and targeted at embedded applications and diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h index cddda1f..1995d1a 100644 --- a/arch/arm/include/asm/thread_info.h +++ b/arch/arm/include/asm/thread_info.h @@ -152,6 +152,7 @@ extern int vfp_restore_user_hwstate(struct user_vfp __user *, #define TIF_SYSCALL_AUDIT 9 #define TIF_SYSCALL_TRACEPOINT 10 #define TIF_SECCOMP 11 /* seccomp syscall filtering active */ +#define TIF_NOHZ 12 /* in adaptive nohz mode */ #define TIF_USING_IWMMXT 17 #define TIF_MEMDIE 18 /* is terminating due to OOM killer 
*/ #define TIF_RESTORE_SIGMASK 20 diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index 0f82098..3449d30 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -396,6 +396,7 @@ ENDPROC(__pabt_svc) #ifdef CONFIG_IRQSOFF_TRACER bl trace_hardirqs_off #endif + ct_user_exit, save = 0 .endm .macro kuser_cmpxchg_check diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S index 3248cde..c8b42de 100644 --- a/arch/arm/kernel/entry-common.S +++ b/arch/arm/kernel/entry-common.S @@ -41,6 +41,7 @@ ret_fast_syscall: /* perform architecture specific actions before user return */ arch_ret_to_user r1, lr + ct_user_enter restore_user_regs fast = 1, offset = S_OFF UNWIND(.fnend ) @@ -76,6 +77,7 @@ no_work_pending: #endif /* perform architecture specific actions before user return */ arch_ret_to_user r1, lr + ct_user_enter, save = 0 restore_user_regs fast = 0, offset = 0 ENDPROC(ret_to_user_from_irq) @@ -394,6 +396,7 @@ ENTRY(vector_swi) mcr p15, 0, ip, c1, c0 @ update control register #endif enable_irq + ct_user_exit get_thread_info tsk adr tbl, sys_call_table @ load syscall table pointer diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S index 9a8531e..782a949 100644 --- a/arch/arm/kernel/entry-header.S +++ b/arch/arm/kernel/entry-header.S @@ -164,6 +164,34 @@ #endif /* !CONFIG_THUMB2_KERNEL */ /* + * Context tracking subsystem. Used to instrument transitions + * between user and kernel mode. 
+ */ + .macro ct_user_exit, save = 1 +#ifdef CONFIG_CONTEXT_TRACKING + .if \save + stmdb sp!, {r0-r3, ip, lr} + bl user_exit + ldmia sp!, {r0-r3, ip, lr} + .else + bl user_exit + .endif +#endif + .endm + + .macro ct_user_enter, save = 1 +#ifdef CONFIG_CONTEXT_TRACKING + .if \save + stmdb sp!, {r0-r3, ip, lr} + bl user_enter + ldmia sp!, {r0-r3, ip, lr} + .else + bl user_enter + .endif +#endif + .endm + +/* * These are the registers used in the syscall handler, and allow us to * have in theory up to 7 arguments to a function - r0 to r6. * -- 1.8.2
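As a rough user-space model of the instrumentation points this patch describes, the sketch below tracks a single "where am I" state and flips it at the user/kernel boundary. All names (`model_user_enter`, `model_user_exit`, `model_syscall`, `handler_reports_state`) are hypothetical; the real hooks are the `ct_user_enter`/`ct_user_exit` assembly macros in the diff, and the register save/restore they perform is not modelled here.

```c
/* User-space model only: which side of the boundary the CPU is on. */
enum ctx_state_model { CTX_KERNEL, CTX_USER };

static enum ctx_state_model cur_state = CTX_KERNEL;
static unsigned int transitions; /* counts state changes, for illustration */

/* Stand-in for user_enter(): called just before returning to user mode. */
static void model_user_enter(void)
{
	if (cur_state != CTX_USER) {
		cur_state = CTX_USER;
		transitions++;
	}
}

/* Stand-in for user_exit(): called right after entering the kernel. */
static void model_user_exit(void)
{
	if (cur_state != CTX_KERNEL) {
		cur_state = CTX_KERNEL;
		transitions++;
	}
}

/* Sample handler: returns its argument only if it runs in kernel context. */
static long handler_reports_state(long arg)
{
	return cur_state == CTX_KERNEL ? arg : -1;
}

/*
 * Models one syscall: exit hook at the entry point (vector_swi in the patch),
 * enter hook on the return path (ret_fast_syscall / ret_to_user).
 */
static long model_syscall(long (*handler)(long), long arg)
{
	long ret;

	model_user_exit();
	ret = handler(arg);
	model_user_enter();
	return ret;
}
```

The point of the model is that every path from user mode into the kernel and back must pass through exactly one exit hook and one enter hook, which is why the patch instruments both the fast and the slow return paths.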

12 years, 2 months

1
1
0 0

[PATCH v2] ARM: add support for context tracking subsystem

by Kevin Hilman

commit 91d1aa43 (context_tracking: New context tracking susbsystem) generalized parts of the RCU userspace extended quiescent state into the context tracking subsystem. Context tracking is then used to implement adaptive tickless (a.k.a. extended nohz).

To support the new context tracking subsystem on ARM, the user/kernel boundary transitions need to be instrumented.

For exceptions and IRQs in usermode, the existing usr_entry macro is used to instrument the user->kernel transition. For the return to usermode path, the ret_to_user* path is instrumented. Using the usr_entry macro, this covers interrupts in userspace, data abort and prefetch abort exceptions in userspace, as well as undefined exceptions in userspace (which is where FP emulation and VFP are handled).

For syscalls, the slow return path is covered by instrumenting the ret_to_user path. In addition, the syscall entry point is instrumented, which covers the user->kernel transition for both fast and slow syscalls, and an additional instrumentation point is added for the fast syscall return path (ret_fast_syscall).
Cc: Mats Liljegren <mats.liljegren(a)enea.com> Cc: Frederic Weisbecker <fweisbec(a)gmail.com> Signed-off-by: Kevin Hilman <khilman(a)linaro.org> --- Updates from v1: - instrument entry/exit points directly in assembly, instead of C code - combined exceptions and syscalls into a single patch - covers VFP and FP emulation now (v1 limitation pointed out by Russell) Depends on the previously posted prerequistes series: [PATCH 0/3] ARM: context tracking support prerequisites http://marc.info/?l=linux-kernel&m=136382248131438&w=2 Both of which are combined on top of Frederic's 3.9-rc1-nohz1 branch and available here: git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux.git arm-nohz-v2/context-tracking arch/arm/Kconfig | 1 + arch/arm/include/asm/thread_info.h | 1 + arch/arm/kernel/entry-armv.S | 1 + arch/arm/kernel/entry-common.S | 3 +++ arch/arm/kernel/entry-header.S | 20 ++++++++++++++++++++ 5 files changed, 26 insertions(+) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index ba8bf89..0b13689 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -59,6 +59,7 @@ config ARM select OLD_SIGSUSPEND3 select OLD_SIGACTION select HAVE_VIRT_CPU_ACCOUNTING + select HAVE_CONTEXT_TRACKING help The ARM series is a line of low-power-consumption RISC chip designs licensed by ARM Ltd and targeted at embedded applications and diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h index cddda1f..1995d1a 100644 --- a/arch/arm/include/asm/thread_info.h +++ b/arch/arm/include/asm/thread_info.h @@ -152,6 +152,7 @@ extern int vfp_restore_user_hwstate(struct user_vfp __user *, #define TIF_SYSCALL_AUDIT 9 #define TIF_SYSCALL_TRACEPOINT 10 #define TIF_SECCOMP 11 /* seccomp syscall filtering active */ +#define TIF_NOHZ 12 /* in adaptive nohz mode */ #define TIF_USING_IWMMXT 17 #define TIF_MEMDIE 18 /* is terminating due to OOM killer */ #define TIF_RESTORE_SIGMASK 20 diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index 
0f82098..1034d40 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -396,6 +396,7 @@ ENDPROC(__pabt_svc) #ifdef CONFIG_IRQSOFF_TRACER bl trace_hardirqs_off #endif + ct_user_exit .endm .macro kuser_cmpxchg_check diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S index 3248cde..5c2b27a 100644 --- a/arch/arm/kernel/entry-common.S +++ b/arch/arm/kernel/entry-common.S @@ -38,6 +38,7 @@ ret_fast_syscall: #if defined(CONFIG_IRQSOFF_TRACER) asm_trace_hardirqs_on #endif + ct_user_enter /* perform architecture specific actions before user return */ arch_ret_to_user r1, lr @@ -74,6 +75,7 @@ no_work_pending: #if defined(CONFIG_IRQSOFF_TRACER) asm_trace_hardirqs_on #endif + ct_user_enter /* perform architecture specific actions before user return */ arch_ret_to_user r1, lr @@ -394,6 +396,7 @@ ENTRY(vector_swi) mcr p15, 0, ip, c1, c0 @ update control register #endif enable_irq + ct_user_exit get_thread_info tsk adr tbl, sys_call_table @ load syscall table pointer diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S index 9a8531e..d65b86c 100644 --- a/arch/arm/kernel/entry-header.S +++ b/arch/arm/kernel/entry-header.S @@ -164,6 +164,26 @@ #endif /* !CONFIG_THUMB2_KERNEL */ /* + * Context tracking subsystem. Used to instrument transitions + * between user and kernel mode. + */ + .macro ct_user_exit +#ifdef CONFIG_CONTEXT_TRACKING + stmdb sp!, {r0-r3, ip, lr} + bl user_exit + ldmia sp!, {r0-r3, ip, lr} +#endif + .endm + + .macro ct_user_enter +#ifdef CONFIG_CONTEXT_TRACKING + stmdb sp!, {r0-r3, ip, lr} + bl user_enter + ldmia sp!, {r0-r3, ip, lr} +#endif + .endm + +/* * These are the registers used in the syscall handler, and allow us to * have in theory up to 7 arguments to a function - r0 to r6. * -- 1.8.2

12 years, 2 months

1
0
0 0

[PATCH 00/15] ARM: cpuidle: code consolidation

by Daniel Lezcano

The flag CPUIDLE_FLAG_TIMER_STOP was introduced in commit 89878baa73f0f1c679355006bd8632e5d78f96c2. The flag tells the cpuidle framework that the local timer will stop in the idle state.

It is now easy to know whether a cpuidle driver will use the broadcast timer by checking its states for this flag, and then to set up the broadcast timer accordingly.

Once the duplicated timer initialization code is removed from the different drivers, most of the drivers are left with the same init function. This init function is made generic, moved into the ARM cpuidle driver, and used from the drivers.

That cleans up the code and removes a lot of annoying duplication.

There is still some modification to do in OMAP4, tegra2, tegra3 and imx, especially around the coupled idle states, but we are getting closer and closer to a common skeleton for all the ARM drivers.

Daniel Lezcano (15):
  timer: move enum definition out of ifdef section
  cpuidle: initialize the broadcast timer framework
  cpuidle: ux500: remove timer broadcast initialization
  cpuidle: OMAP4: remove timer broadcast initialization
  cpuidle: imx6: remove timer broadcast initialization
  ARM: cpuidle: remove useless declaration
  ARM: cpuidle: add init/exit routine
  ARM: ux500: cpuidle: use init/exit common routine
  ARM: at91: cpuidle: use init/exit common routine
  ARM: OMAP3: cpuidle: use init/exit common routine
  ARM: s3c64xx: cpuidle: use init/exit common routine
  ARM: tegra1: cpuidle: use init/exit common routine
  ARM: shmobile: pm: fix init sections
  ARM: shmobile: cpuidle: remove useless WFI function
  ARM: shmobile: cpuidle: use init/exit common routine

 arch/arm/include/asm/cpuidle.h         | 11 +++---
 arch/arm/kernel/cpuidle.c              | 57 +++++++++++++++++++++++++++++++-
 arch/arm/mach-at91/cpuidle.c           | 17 ++--------
 arch/arm/mach-imx/cpuidle-imx6q.c      | 15 ---------
 arch/arm/mach-omap2/cpuidle34xx.c      | 18 ++--------
 arch/arm/mach-omap2/cpuidle44xx.c      | 14 --------
 arch/arm/mach-s3c64xx/cpuidle.c        | 15 ++-------
 arch/arm/mach-shmobile/cpuidle.c       | 22 ++----------
 arch/arm/mach-shmobile/pm-sh7372.c     |  4 +--
 arch/arm/mach-tegra/cpuidle-tegra114.c | 27 +--------------
 arch/arm/mach-ux500/cpuidle.c          | 50 +---------------------------
 drivers/cpuidle/driver.c               | 35 ++++++++++++++++++--
 include/linux/clockchips.h             | 22 ++++++------
 include/linux/cpuidle.h                |  2 ++
 14 files changed, 120 insertions(+), 189 deletions(-)

--
1.7.9.5
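The flag check described in this cover letter can be sketched in user space as follows. This is an illustration under assumptions, not the code from the series: `MODEL_FLAG_TIMER_STOP` stands in for CPUIDLE_FLAG_TIMER_STOP, and the structs and `model_setup_broadcast()` are hypothetical simplifications of the cpuidle driver and state structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for CPUIDLE_FLAG_TIMER_STOP; the bit position is arbitrary. */
#define MODEL_FLAG_TIMER_STOP (1u << 0)

/* Hypothetical, simplified idle state and driver descriptions. */
struct idle_state_model {
	const char *name;
	unsigned int flags;
};

struct idle_driver_model {
	const struct idle_state_model *states;
	size_t state_count;
	bool needs_broadcast; /* result: must the broadcast timer be set up? */
};

/*
 * Scan the driver's states once: if any state stops the local timer,
 * the framework has to set up the broadcast timer on the driver's behalf.
 */
static void model_setup_broadcast(struct idle_driver_model *drv)
{
	size_t i;

	drv->needs_broadcast = false;
	for (i = 0; i < drv->state_count; i++) {
		if (drv->states[i].flags & MODEL_FLAG_TIMER_STOP) {
			drv->needs_broadcast = true;
			break;
		}
	}
}
```

With the decision made in one common place, each platform driver only has to mark its deep states with the flag instead of carrying its own copy of the timer setup code.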

12 years, 2 months

11
44
0 0

[PATCH V3 0/4] CPUFreq: Implement per policy instances of governors

by Viresh Kumar

This is targeted for 3.10-rc1 or linux-next just after the merge window. All patches are pushed here for others to apply:

http://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/h…

Currently, there can't be multiple instances of a single governor_type. If we have a multi-package system, where we have multiple instances of struct policy (per package), we can't have multiple instances of the same governor, i.e. we can't have multiple instances of the ondemand governor for multiple packages.

The governor's directory in sysfs is created at /sys/devices/system/cpu/cpufreq/governor-name/, which again reflects that there can be only one instance of a governor_type in the system.

This is a bottleneck for multicluster systems, where we want different packages to use the same governor type, but with different tunables.

This patchset aims to fix this issue. Now we will create the governor's directory in cpu/cpu*/cpufreq/<gov> for platforms which have multiple struct policy alive at any moment. For others the interface is kept the same: cpu/cpufreq/<gov>.

@Rafael: Clearly, I don't want to have the following patch:

"cpufreq: Add Kconfig option to enable/disable have_multiple_policies"

and added it only because of a comment from Borislov to which nobody else replied :) So, please drop it if you agree with my comments on the earlier version.

V2->V3:
- Fixed value of CPUFREQ_GOV_POLICY_EXIT in the correct patch
- Drop indentation fixes from intel_pstate.c

V1->V2:
- Few patches from V1 are already picked up by Rafael for 3.9-rc1
- Last two patches are new
- Added dbs_data->exit() routines to free up memory used for struct tuners.

Viresh Kumar (4):
  cpufreq: Add per policy governor-init/exit infrastructure
  cpufreq: governor: Implement per policy instances of governors
  cpufreq: Get rid of "struct global_attr"
  cpufreq: Add Kconfig option to enable/disable have_multiple_policies

 drivers/cpufreq/Kconfig                |   3 +
 drivers/cpufreq/acpi-cpufreq.c         |   9 +-
 drivers/cpufreq/cpufreq.c              |  27 +++--
 drivers/cpufreq/cpufreq_conservative.c | 148 +++++++++++++---------
 drivers/cpufreq/cpufreq_governor.c     | 159 ++++++++++++++----------
 drivers/cpufreq/cpufreq_governor.h     |  43 +++++--
 drivers/cpufreq/cpufreq_ondemand.c     | 216 +++++++++++++++++++--------------
 drivers/cpufreq/intel_pstate.c         |  20 +--
 include/linux/cpufreq.h                |  44 ++++---
 9 files changed, 397 insertions(+), 272 deletions(-)

--
1.7.12.rc2.18.g61b472e
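The difference between one global governor instance and per-policy instances can be modelled in user space like this. It is a sketch under assumptions: `struct tunables_model`, `struct gov_policy_model` and `model_gov_init()` are hypothetical names, the default values are made up, and the real series works through the governor init/exit events rather than a helper like this.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical subset of governor tunables. */
struct tunables_model {
	unsigned int sampling_rate;
	unsigned int up_threshold;
};

/* Hypothetical policy: one per package/cluster. */
struct gov_policy_model {
	int first_cpu;
	struct tunables_model *gov_tunables;
};

/* Shared instance used when the platform has a single set of tunables. */
static struct tunables_model *global_tunables;

/*
 * Attach tunables to a policy. With have_multiple_policies set, every policy
 * gets its own instance; otherwise all policies share the global one.
 * Returns 0 on success, -1 on allocation failure.
 */
static int model_gov_init(struct gov_policy_model *p,
			  bool have_multiple_policies)
{
	struct tunables_model *t;

	if (!have_multiple_policies && global_tunables) {
		p->gov_tunables = global_tunables;
		return 0;
	}

	t = malloc(sizeof(*t));
	if (!t)
		return -1;

	t->sampling_rate = 10000; /* made-up default */
	t->up_threshold = 80;     /* made-up default */
	p->gov_tunables = t;
	if (!have_multiple_policies)
		global_tunables = t;
	return 0;
}
```

In the per-policy case, changing up_threshold for one cluster leaves the other cluster's value untouched, which is the behaviour the cover letter asks for on multicluster systems.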

12 years, 2 months

3
25
0 0

[PATCH 1/2] cpufreq: drivers: don't check range of target freq in .target()

by Viresh Kumar

Cpufreq core checks the range of target_freq before calling driver->target() and so we don't need to do it again. Remove it.

Cc: Sekhar Nori <nsekhar(a)ti.com>
Cc: Linus Walleij <linus.walleij(a)linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
 arch/arm/mach-davinci/cpufreq.c  | 9 ---------
 drivers/cpufreq/dbx500-cpufreq.c | 6 ------
 2 files changed, 15 deletions(-)

diff --git a/arch/arm/mach-davinci/cpufreq.c b/arch/arm/mach-davinci/cpufreq.c
index 55eb870..8fb0c2a 100644
--- a/arch/arm/mach-davinci/cpufreq.c
+++ b/arch/arm/mach-davinci/cpufreq.c
@@ -79,15 +79,6 @@ static int davinci_target(struct cpufreq_policy *policy,
 	struct davinci_cpufreq_config *pdata = cpufreq.dev->platform_data;
 	struct clk *armclk = cpufreq.armclk;
 
-	/*
-	 * Ensure desired rate is within allowed range. Some govenors
-	 * (ondemand) will just pass target_freq=0 to get the minimum.
-	 */
-	if (target_freq < policy->cpuinfo.min_freq)
-		target_freq = policy->cpuinfo.min_freq;
-	if (target_freq > policy->cpuinfo.max_freq)
-		target_freq = policy->cpuinfo.max_freq;
-
 	freqs.old = davinci_getspeed(0);
 	freqs.new = clk_round_rate(armclk, target_freq * 1000) / 1000;
 
diff --git a/drivers/cpufreq/dbx500-cpufreq.c b/drivers/cpufreq/dbx500-cpufreq.c
index 7192a6d..15ed367 100644
--- a/drivers/cpufreq/dbx500-cpufreq.c
+++ b/drivers/cpufreq/dbx500-cpufreq.c
@@ -37,12 +37,6 @@ static int dbx500_cpufreq_target(struct cpufreq_policy *policy,
 	unsigned int idx;
 	int ret;
 
-	/* scale the target frequency to one of the extremes supported */
-	if (target_freq < policy->cpuinfo.min_freq)
-		target_freq = policy->cpuinfo.min_freq;
-	if (target_freq > policy->cpuinfo.max_freq)
-		target_freq = policy->cpuinfo.max_freq;
-
 	/* Lookup the next frequency */
 	if (cpufreq_frequency_table_target(policy, freq_table, target_freq,
 					relation, &idx))
--
1.7.12.rc2.18.g61b472e
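The reasoning behind the removal can be shown with a small user-space model: if the common layer clamps the request before calling into the driver, a second clamp inside the driver can never change the value. The names below (`struct freq_limits_model`, `model_clamp_target`, `model_driver_target`, `model_core_set_target`) are hypothetical, and which limits the real core applies is not modelled here.

```c
/* Hypothetical frequency limits, in kHz. */
struct freq_limits_model {
	unsigned int min_freq;
	unsigned int max_freq;
};

/* The range check, done once in the common layer. */
static unsigned int model_clamp_target(const struct freq_limits_model *lim,
				       unsigned int target_freq)
{
	if (target_freq < lim->min_freq)
		return lim->min_freq;
	if (target_freq > lim->max_freq)
		return lim->max_freq;
	return target_freq;
}

/* Driver side: trusts its caller and does no range check of its own. */
static unsigned int model_driver_target(unsigned int target_freq)
{
	return target_freq;
}

/* Common layer: clamp first, then hand the request to the driver. */
static unsigned int model_core_set_target(const struct freq_limits_model *lim,
					  unsigned int target_freq)
{
	return model_driver_target(model_clamp_target(lim, target_freq));
}
```

A governor that passes target_freq=0 to ask for the minimum still ends up at min_freq, because the clamp happens before the driver is reached.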

12 years, 2 months

3
7
0 0

[PATCH v2] drm/exynos: enable FIMD clocks

by Vikas Sajjan

While migrating to the common clock framework (CCF), we found that the FIMD clocks were pulled down by the CCF. If the CCF finds any clock that has not been claimed by a driver, it pulls that clock low.

Calling clk_prepare_enable() for the FIMD clocks fixes the issue.

This patch also replaces clk_disable() with clk_disable_unprepare() during exit.

Signed-off-by: Vikas Sajjan <vikas.sajjan(a)linaro.org>
---
Changes since v1:
	- added error checking for clk_prepare_enable() and also replaced
	  clk_disable() with clk_disable_unprepare() during exit.
---
 drivers/gpu/drm/exynos/exynos_drm_fimd.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
index 9537761..014d750 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
@@ -934,6 +934,19 @@ static int fimd_probe(struct platform_device *pdev)
 		return ret;
 	}
 
+	ret = clk_prepare_enable(ctx->lcd_clk);
+	if (ret) {
+		dev_err(dev, "failed to enable 'sclk_fimd' clock\n");
+		return ret;
+	}
+
+	ret = clk_prepare_enable(ctx->bus_clk);
+	if (ret) {
+		clk_disable_unprepare(ctx->lcd_clk);
+		dev_err(dev, "failed to enable 'fimd' clock\n");
+		return ret;
+	}
+
 	ctx->vidcon0 = pdata->vidcon0;
 	ctx->vidcon1 = pdata->vidcon1;
 	ctx->default_win = pdata->default_win;
@@ -981,8 +994,8 @@ static int fimd_remove(struct platform_device *pdev)
 	if (ctx->suspended)
 		goto out;
 
-	clk_disable(ctx->lcd_clk);
-	clk_disable(ctx->bus_clk);
+	clk_disable_unprepare(ctx->lcd_clk);
+	clk_disable_unprepare(ctx->bus_clk);
 
 	pm_runtime_set_suspended(dev);
 	pm_runtime_put_sync(dev);
--
1.7.9.5
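The error handling added in v2 follows the usual "undo what already succeeded" pattern. A user-space model of it, with mock clocks, could look like the sketch below. `struct clk_model` and the `model_*` functions are hypothetical stand-ins, not the clk API; only the ordering of enable and unwind mirrors the probe hunk.

```c
/* Mock clock: counts enables and can be told to fail. */
struct clk_model {
	int enable_count;
	int fail_enable; /* non-zero: the next enable attempt fails */
};

/* Mock of a prepare+enable call: 0 on success, negative on failure. */
static int model_clk_prepare_enable(struct clk_model *c)
{
	if (c->fail_enable)
		return -1;
	c->enable_count++;
	return 0;
}

/* Mock of the matching disable+unprepare call. */
static void model_clk_disable_unprepare(struct clk_model *c)
{
	c->enable_count--;
}

/*
 * Enable both clocks; if the second one fails, release the first one
 * before returning so no clock is left enabled on the error path.
 */
static int model_probe(struct clk_model *lcd_clk, struct clk_model *bus_clk)
{
	int ret;

	ret = model_clk_prepare_enable(lcd_clk);
	if (ret)
		return ret;

	ret = model_clk_prepare_enable(bus_clk);
	if (ret) {
		model_clk_disable_unprepare(lcd_clk);
		return ret;
	}

	return 0;
}
```

After a failed probe in this model, both enable counts are back at zero, so a later retry starts from a clean state.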

12 years, 2 months

2
2
0 0

[PATCH v2 0/2] dma-buf: Add support for debugfs

by Sumit Semwal

This patch series adds much-missed debugfs support to the dma-buf framework.

Based on the feedback received on v1 of this series, support is also added to allow exporters to provide name strings that will prove useful while debugging.

Some more magic can be added for more advanced debugging, but we'll leave that for the time being.

Best regards,
~Sumit.

Sumit Semwal (2):
  dma-buf: replace dma_buf_export() with dma_buf_export_named()
  dma-buf: Add debugfs support

 Documentation/dma-buf-sharing.txt |  13 ++-
 drivers/base/dma-buf.c            | 173 ++++++++++++++++++++++++++++++++++++-
 include/linux/dma-buf.h           |  16 +++-
 3 files changed, 193 insertions(+), 9 deletions(-)

--
1.7.10.4

12 years, 2 months

3
5
0 0

[PATCH] cpufreq: cpu0: Fix mistake in Documentation example

by Viresh Kumar

"clock-latency" is incorrectly written as "transition-latency" in an example present in Documentation of cpufreq-cpu0. Fix it.

Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
 Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt b/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt
index 4416ccc..051f764 100644
--- a/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt
+++ b/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt
@@ -32,7 +32,7 @@ cpus {
 			396000  950000
 			198000  850000
 		>;
-		transition-latency = <61036>; /* two CLK32 periods */
+		clock-latency = <61036>; /* two CLK32 periods */
 	};
 
 	cpu@1 {
--
1.7.12.rc2.18.g61b472e
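The value 61036 in the example is consistent with two periods of a 32768 Hz clock expressed in nanoseconds and rounded up. That CLK32 runs at 32768 Hz is an assumption here, since the binding text only says "two CLK32 periods". A small helper (hypothetical name `periods_to_ns_ceil`) makes the arithmetic explicit:

```c
#include <stdint.h>

/*
 * Convert a number of clock periods to nanoseconds, rounding up.
 * rate_hz is the clock rate; 32768 Hz is assumed for CLK32 in the
 * example above.
 */
static uint64_t periods_to_ns_ceil(uint64_t periods, uint64_t rate_hz)
{
	const uint64_t ns_per_sec = 1000000000ULL;

	return (periods * ns_per_sec + rate_hz - 1) / rate_hz;
}
```

Two periods at 32768 Hz are 2 * 10^9 / 32768 = 61035.15625 ns, which rounds up to the 61036 used in the binding example.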

12 years, 2 months

3
3
0 0

[PATCH v6 0/5] Add ST-Ericsson AB8500 HWMON driver

by Hongbo Zhang

Guenter,
Please check these v6 patches, and thanks for all your reviews and comments on this patch set.

Anton Vorontsov, for the v5 and v6 patches:
I have added your Acked-by: to patch [1/5] (which was [1/3]), and split the old [2/3] into the new [2/5], [3/5] and [4/5] patches, so please have a look at them again, thank you.

v5 -> v6 changes:
- add depend on AB8500_BM in Kconfig
- fix wrong usage of clamp_val()
- export symbols for module compiling

v4 -> v5 changes:
- split the old [2/3]-ab8500-re-arrange-ab8500-power-and-temperature-data into three new patches: [2/5], [3/5] and [4/5].
- hwmon driver minor coding style clean ups:
  - {} usage in if-else statement in ab8500_read_sensor function
  - index error fix in gpadc_monitor function
  - fix issue of clamp_val() usage
  - remove unnecessary else in function abx500_attrs_visible
  - remove redundant print message about irq set up
  - return the calling function return value directly in probe function

v3 -> v4 changes:
for patch [3/3]
- define delays in HZ
- update ab8500_read_sensor function, returning temp by parameter
- remove ab8500_is_visible function
- use clamp_val in set_min and set_max callback
- remove unnecessary locks in remove and suspend functions
- let abx500 and ab8500 use their own data structures
for patch [2/3]
- move the data tables from drivers/power/ab8500_bmdata.c to include/linux/power/ab8500.h
- rename drivers/power/ab8500_bmdata.c to drivers/power/ab8500_bm.c
- rename these variable names to eliminate CamelCase warnings
- add const attribute to these data

v2 -> v3 changes:
- Add interface for converting voltage to temperature
- Remove temp5 sensor since we cannot offer a temperature read interface for it
- Update hyst to use absolute temperature instead of a difference
- Add the 3/3 patch

v1 -> v2 changes:
- Add Documentation/hwmon/ab8500 and Documentation/hwmon/abx500
- Make devices which cannot report milli-Celsius invisible
- Add temp5_crit interface
- Re-work the old find_active_thresholds() to threshold_updated()
- Reset updated_min_alarm and updated_max_alarm at the end of each loop
- Update the hyst mechanism to make it work as real hyst
- Remove non-standard attributes
- Re-order the operations sequence inside probe and remove functions
- Update all the lock usages to eliminate race conditions
- Make attribute indexes start from 0
also changes:
- The old [1/2] "ARM: ux500: rename ab8500 to abx500 for hwmon driver" has been merged by Samuel, so it is not sent again.
- Add another new patch "ab8500_btemp: export two symblols" as [2/2] of this patch set.

Hongbo Zhang (5):
  ab8500_btemp: make ab8500_btemp_get* interfaces public
  ab8500: power: eliminate CamelCase warning of some variables
  ab8500: power: add const attributes to some data arrays
  ab8500: power: export abx500_res_to_temp tables for hwmon
  hwmon: add ST-Ericsson ABX500 hwmon driver

 Documentation/hwmon/ab8500           |  22 ++
 Documentation/hwmon/abx500           |  28 ++
 drivers/hwmon/Kconfig                |  13 +
 drivers/hwmon/Makefile               |   1 +
 drivers/hwmon/ab8500.c               | 206 +++++++++++++++
 drivers/hwmon/abx500.c               | 491 +++++++++++++++++++++++++++++++++++
 drivers/hwmon/abx500.h               |  69 +++++
 drivers/power/ab8500_bmdata.c        |  42 +--
 drivers/power/ab8500_btemp.c         |   5 +-
 drivers/power/ab8500_fg.c            |   4 +-
 include/linux/mfd/abx500.h           |   6 +-
 include/linux/mfd/abx500/ab8500-bm.h |   5 +
 include/linux/power/ab8500.h         |  16 ++
 13 files changed, 885 insertions(+), 23 deletions(-)
 create mode 100644 Documentation/hwmon/ab8500
 create mode 100644 Documentation/hwmon/abx500
 create mode 100644 drivers/hwmon/ab8500.c
 create mode 100644 drivers/hwmon/abx500.c
 create mode 100644 drivers/hwmon/abx500.h
 create mode 100644 include/linux/power/ab8500.h

--
1.8.0
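The changelog item "Update hyst to use absolute temperature instead of a difference" can be illustrated with a small user-space model. This is a sketch of hysteresis with an absolute threshold in general, not code from the driver: `struct temp_alarm_model` and `model_update_max_alarm()` are hypothetical, and temperatures are assumed to be in millidegrees Celsius.

```c
#include <stdbool.h>

/* Hypothetical alarm state; all temperatures in millidegrees Celsius. */
struct temp_alarm_model {
	long max;      /* alarm is raised above this temperature */
	long max_hyst; /* absolute temperature below which it clears */
	bool alarm;
};

/*
 * Update the alarm for a new reading. Between max_hyst and max the
 * previous state is kept, which is what prevents the alarm from
 * toggling when the temperature hovers around the limit.
 */
static bool model_update_max_alarm(struct temp_alarm_model *a, long temp)
{
	if (temp > a->max)
		a->alarm = true;
	else if (temp < a->max_hyst)
		a->alarm = false;
	return a->alarm;
}
```

With max at 85000 and max_hyst at 80000, a reading of 82000 leaves the alarm in whatever state it was in before; had hyst been stored as a difference, the clearing point would have to be recomputed as max minus hyst on every comparison.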

12 years, 2 months

2
7
0 0

[RFC patch 00/11] cpuidle : ARM driver to rule them all

by Daniel Lezcano

At Linaro Connect Asia 2013, the status of the different cpuidle drivers available upstream was presented [1]. It was stated that there is a lot of common code, especially in the init routines, and code duplication (e.g. ux500 vs imx6).

The following patchset is the first step towards a single ARM driver consolidating all the common routines used in the different drivers.

The patchset has been tested on ux500 and at91, and compile-tested on all the other platforms.

[1] https://lca-13.zerista.com/event/member/72362

Daniel Lezcano (11):
  cpuidle : handle clockevent notify from the cpuidle framework
  cpuidle / arm : a single cpuidle driver
  cpuidle / ux500 : use common ARM cpuidle driver
  cpuidle / omap3 : use common ARM cpuidle driver
  cpuidle / davinci : use common ARM driver
  cpuidle / at91 : use common ARM cpuidle driver
  cpuidle / shmobile : use common ARM cpuidle driver
  cpuidle / imx : use common ARM cpuidle driver
  cpuidle / s3c64xx : use common ARM cpuidle driver
  cpuidle / calxeda : use common ARM cpuidle driver
  cpuidle / kirkwood : use common ARM cpuidle driver

 MAINTAINERS                        |   6 ++
 arch/arm/include/asm/cpuidle.h     |   3 +
 arch/arm/mach-at91/cpuidle.c       |  15 +----
 arch/arm/mach-davinci/cpuidle.c    |  20 +------
 arch/arm/mach-imx/Makefile         |   1 -
 arch/arm/mach-imx/cpuidle-imx6q.c  |  18 +-----
 arch/arm/mach-imx/cpuidle.c        |  80 --------------------------
 arch/arm/mach-imx/cpuidle.h        |   6 +-
 arch/arm/mach-imx/pm-imx5.c        |   3 +-
 arch/arm/mach-omap2/cpuidle34xx.c  |  18 +-----
 arch/arm/mach-s3c64xx/cpuidle.c    |  15 +----
 arch/arm/mach-shmobile/cpuidle.c   |  10 +---
 arch/arm/mach-ux500/cpuidle.c      |  56 +-----------------
 drivers/cpuidle/Makefile           |   1 +
 drivers/cpuidle/arm-idle.c         | 112 ++++++++++++++++++++++++++++++++++++
 drivers/cpuidle/cpuidle-calxeda.c  |  52 +----------------
 drivers/cpuidle/cpuidle-kirkwood.c |  17 +-----
 drivers/cpuidle/cpuidle.c          |   9 +++
 include/linux/cpuidle.h            |   1 +
 19 files changed, 149 insertions(+), 294 deletions(-)
 delete mode 100644 arch/arm/mach-imx/cpuidle.c
 create mode 100644 drivers/cpuidle/arm-idle.c

--
1.7.9.5

12 years, 2 months

6
20
0 0
