=== Highlights ===
* Summarized the volatile ranges discussion I ran at lsf-mm:
http://permalink.gmane.org/gmane.linux.kernel.mm/98848
* The lsf-mm volatile ranges discussion was briefly covered by lwn:
https://lwn.net/Articles/548108/
* Reviewed DmitryP's netfilter idletimer patches
* Met with Zach and Karim for LPC Android minisummit planning
* Reviewed blueprints and held bi-weekly upstreaming hangout
* Discussed RTC vs persistent_clock confusion and issues on lkml
* Worked with Zoran on suspend/resume issue & general git/community
process stuff.
* Discussed DmitryP's thought of using Gerrit for Linaro test development
* Updated linaro.android tree to AOSP's -rc7 branch, but reverted when
Tixy saw some issues
* Worked with Tixy to get his cpufreq fix integrated into the
linaro-fixes branch and pushed upstream to AOSP
* Discussed ION build issues w/ Jessee Barker
* Worked on rebasing and reworking Minchan's and my volatile ranges
patches so they are more coherent and unified.
=== Plans ===
* Continue reworking the volatile ranges patchset and send to lkml
* Review tglx's clocksource unregister patches
* More LPC minisummit planning
* Probably more ION research
=== Issues ===
* NA
After prior discussion (over private email) with the current maintainer of the
cpufreq framework (Rafael), I am adding myself as a co-maintainer of the
cpufreq framework. This would mostly be for the cpufreq core and ARM drivers,
but is not restricted to them.
This also adds the path of the git tree through which cpufreq patches are
pulled in.
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
V1->V2:
- Added path of git tree too.
- Cc'd ARM SoC Maintainers.
MAINTAINERS | 2 ++
1 file changed, 2 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 68d376e..cbed63c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2211,9 +2211,11 @@ F: drivers/net/ethernet/ti/cpmac.c
CPU FREQUENCY DRIVERS
M: Rafael J. Wysocki <rjw(a)sisk.pl>
+M: Viresh Kumar <viresh.kumar(a)linaro.org>
L: cpufreq(a)vger.kernel.org
L: linux-pm(a)vger.kernel.org
S: Maintained
+T: git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
F: drivers/cpufreq/
F: include/linux/cpufreq.h
--
1.7.12.rc2.18.g61b472e
Currently the cpuidle drivers are spread across the different archs.
Patch submissions for cpuidle follow different paths: the cpuidle core code
goes to linux-pm, the ARM drivers go to arm-soc or the SoC-specific tree, sh
goes through the sh arch tree, pseries goes through PowerPC, and finally intel
goes through Len's tree while acpi_idle goes under linux-pm.
That makes it difficult to consolidate the code and to propagate modifications
from the cpuidle core to the different drivers.
Fortunately, a movement has started to put the cpuidle drivers into the
drivers/cpuidle directory, like cpuidle-calxeda.c and cpuidle-kirkwood.c.
Add an explicit maintainer entry in the MAINTAINERS file to clarify the
situation and prevent new cpuidle drivers from going to an arch directory.
The upstreaming process is unchanged: Rafael takes the patches and merges them
into his tree, but with the Acked-by from the driver's maintainer. So the file
header must contain the name of the maintainer.
This organization will be the same as for cpufreq.
Signed-off-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
Acked-by: Linus Walleij <linus.walleij(a)linaro.org>
Acked-by: Andrew Lunn <andrew(a)lunn.ch> #for kirkwood
Acked-by: Jason Cooper <jason(a)lakedaemon.net> #for kirkwood
---
MAINTAINERS | 9 +++++++++
drivers/cpuidle/cpuidle-calxeda.c | 4 +++-
drivers/cpuidle/cpuidle-kirkwood.c | 5 +++--
3 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index 61677c3..45ee6dc 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2217,6 +2217,15 @@ F: drivers/cpufreq/arm_big_little.h
F: drivers/cpufreq/arm_big_little.c
F: drivers/cpufreq/arm_big_little_dt.c
+CPUIDLE DRIVERS
+M: Rafael J. Wysocki <rjw(a)sisk.pl>
+M: Daniel Lezcano <daniel.lezcano(a)linaro.org>
+L: linux-pm(a)vger.kernel.org
+S: Maintained
+T: git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
+F: drivers/cpuidle/*
+F: include/linux/cpuidle.h
+
CPUID/MSR DRIVER
M: "H. Peter Anvin" <hpa(a)zytor.com>
S: Maintained
diff --git a/drivers/cpuidle/cpuidle-calxeda.c b/drivers/cpuidle/cpuidle-calxeda.c
index e344b56..2233791 100644
--- a/drivers/cpuidle/cpuidle-calxeda.c
+++ b/drivers/cpuidle/cpuidle-calxeda.c
@@ -1,7 +1,7 @@
/*
* Copyright 2012 Calxeda, Inc.
*
- * Based on arch/arm/plat-mxc/cpuidle.c:
+ * Based on arch/arm/plat-mxc/cpuidle.c: #v3.7
* Copyright 2012 Freescale Semiconductor, Inc.
* Copyright 2012 Linaro Ltd.
*
@@ -16,6 +16,8 @@
*
* You should have received a copy of the GNU General Public License along with
* this program. If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Maintainer: Rob Herring <rob.herring(a)calxeda.com>
*/
#include <linux/cpuidle.h>
diff --git a/drivers/cpuidle/cpuidle-kirkwood.c b/drivers/cpuidle/cpuidle-kirkwood.c
index 53290e1..521b0a7 100644
--- a/drivers/cpuidle/cpuidle-kirkwood.c
+++ b/drivers/cpuidle/cpuidle-kirkwood.c
@@ -1,6 +1,4 @@
/*
- * arch/arm/mach-kirkwood/cpuidle.c
- *
* CPU idle Marvell Kirkwood SoCs
*
* This file is licensed under the terms of the GNU General Public
@@ -11,6 +9,9 @@
* to implement two idle states -
* #1 wait-for-interrupt
* #2 wait-for-interrupt and DDR self refresh
+ *
+ * Maintainer: Jason Cooper <jason(a)lakedaemon.net>
+ * Maintainer: Andrew Lunn <andrew(a)lunn.ch>
*/
#include <linux/kernel.h>
--
1.7.9.5
After prior discussion (over private email) with the current maintainer of the
cpufreq framework (Rafael), I am adding myself as a co-maintainer of the
cpufreq framework. This would mostly be for the cpufreq core and ARM drivers,
but is not restricted to them.
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 68d376e..bcef513 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2211,6 +2211,7 @@ F: drivers/net/ethernet/ti/cpmac.c
CPU FREQUENCY DRIVERS
M: Rafael J. Wysocki <rjw(a)sisk.pl>
+M: Viresh Kumar <viresh.kumar(a)linaro.org>
L: cpufreq(a)vger.kernel.org
L: linux-pm(a)vger.kernel.org
S: Maintained
--
1.7.12.rc2.18.g61b472e
Commit bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 brought the multiple driver
support. The code added a couple of new APIs to register a driver per cpu.
That led to some code complexity to handle the kernel config options depending
on whether the multiple driver support is enabled, which is not really
necessary. The code has to be compatible when the multiple driver support is
not enabled, and the multiple driver support has to be compatible with the old
API.
This patch removes this API, which is not yet used by any driver. The per-cpu
capability is still needed for the HMP cpuidle drivers which will come soon,
so the API is replaced by a cpumask pointer in the cpuidle driver structure
telling which cpus are handled by the driver. That lets the
cpuidle_[un]register_driver API be used for the multiple driver support.
The current code, a bit poor in comments, has been commented and simplified.
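For illustration only, the cpumask-based registration described above can be
modeled in user space. The sketch below is not kernel code: struct
model_driver, MODEL_NR_CPUS, the bitmap mask and the model_* helpers are
hypothetical stand-ins for struct cpuidle_driver, the per-cpu variable and
the __cpuidle_{get_cpu,set,unset}_driver helpers. Unlike the helper in this
patch, the model checks every CPU before assigning any of them, so a failed
registration leaves no partial state; that is a choice of this sketch.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for this model */

/* Hypothetical stand-in for struct cpuidle_driver; the mask is a bitmap. */
struct model_driver {
	const char *name;
	unsigned long cpumask;	/* bit n set: the driver handles CPU n */
};

/* Models the per-cpu cpuidle_drivers variable. */
static struct model_driver *model_drivers[MODEL_NR_CPUS];

struct model_driver *model_get_cpu_driver(int cpu)
{
	return model_drivers[cpu];
}

int model_set_driver(struct model_driver *drv)
{
	int cpu;

	/* An empty mask defaults to every CPU, like cpu_possible_mask. */
	if (!drv->cpumask)
		drv->cpumask = (1UL << MODEL_NR_CPUS) - 1;

	/* First pass: refuse if any targeted CPU already has a driver. */
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if ((drv->cpumask & (1UL << cpu)) && model_drivers[cpu])
			return -EBUSY;

	/* Second pass: assign the driver to every CPU in its mask. */
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (drv->cpumask & (1UL << cpu))
			model_drivers[cpu] = drv;

	return 0;
}

void model_unset_driver(struct model_driver *drv)
{
	int cpu;

	/* Only clear the slots that still point at this driver. */
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if ((drv->cpumask & (1UL << cpu)) && model_drivers[cpu] == drv)
			model_drivers[cpu] = NULL;
}
```

With two drivers whose masks are 0x3 and 0xc, each CPU resolves to its own
driver, and a third driver whose mask overlaps either one is rejected with
-EBUSY.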
Signed-off-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
---
drivers/cpuidle/driver.c | 325 ++++++++++++++++++++++++++++------------------
include/linux/cpuidle.h | 21 +--
2 files changed, 212 insertions(+), 134 deletions(-)
diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c
index 8dfaaae..2db96b5 100644
--- a/drivers/cpuidle/driver.c
+++ b/drivers/cpuidle/driver.c
@@ -18,206 +18,267 @@
DEFINE_SPINLOCK(cpuidle_driver_lock);
-static void __cpuidle_set_cpu_driver(struct cpuidle_driver *drv, int cpu);
-static struct cpuidle_driver * __cpuidle_get_cpu_driver(int cpu);
+#ifdef CONFIG_CPU_IDLE_MULTIPLE_DRIVERS
-static void cpuidle_setup_broadcast_timer(void *arg)
+static DEFINE_PER_CPU(struct cpuidle_driver *, cpuidle_drivers);
+
+/**
+ * __cpuidle_get_cpu_driver: returns the cpuidle driver tied with the specified
+ * cpu.
+ *
+ * @cpu: an integer specifying the cpu number
+ *
+ * Returns a pointer to struct cpuidle_driver, NULL if no driver has been
+ * registered for this cpu
+ */
+static struct cpuidle_driver *__cpuidle_get_cpu_driver(int cpu)
{
- int cpu = smp_processor_id();
- clockevents_notify((long)(arg), &cpu);
+ return per_cpu(cpuidle_drivers, cpu);
}
-static void __cpuidle_driver_init(struct cpuidle_driver *drv, int cpu)
+/**
+ * __cpuidle_set_driver: assign to the per cpu variable the driver pointer for
+ * each cpu the driver is assigned to with the cpumask.
+ *
+ * @drv: a pointer to a struct cpuidle_driver
+ *
+ * Returns 0 on success, < 0 otherwise
+ */
+static inline int __cpuidle_set_driver(struct cpuidle_driver *drv)
{
- int i;
+ int cpu;
- drv->refcnt = 0;
+ for_each_cpu(cpu, drv->cpumask) {
- for (i = drv->state_count - 1; i >= 0 ; i--) {
+ if (__cpuidle_get_cpu_driver(cpu))
+ return -EBUSY;
- if (!(drv->states[i].flags & CPUIDLE_FLAG_TIMER_STOP))
- continue;
-
- drv->bctimer = 1;
- on_each_cpu_mask(get_cpu_mask(cpu), cpuidle_setup_broadcast_timer,
- (void *)CLOCK_EVT_NOTIFY_BROADCAST_ON, 1);
- break;
+ per_cpu(cpuidle_drivers, cpu) = drv;
}
+
+ return 0;
}
-static int __cpuidle_register_driver(struct cpuidle_driver *drv, int cpu)
+/**
+ * __cpuidle_unset_driver: for each cpu the driver is handling, set the per cpu
+ * variable driver to NULL.
+ *
+ * @drv: a pointer to a struct cpuidle_driver
+ */
+static inline void __cpuidle_unset_driver(struct cpuidle_driver *drv)
{
- if (!drv || !drv->state_count)
- return -EINVAL;
-
- if (cpuidle_disabled())
- return -ENODEV;
-
- if (__cpuidle_get_cpu_driver(cpu))
- return -EBUSY;
+ int cpu;
- __cpuidle_driver_init(drv, cpu);
+ for_each_cpu(cpu, drv->cpumask) {
- __cpuidle_set_cpu_driver(drv, cpu);
+ if (drv != __cpuidle_get_cpu_driver(cpu))
+ continue;
- return 0;
+ per_cpu(cpuidle_drivers, cpu) = NULL;
+ }
}
-static void __cpuidle_unregister_driver(struct cpuidle_driver *drv, int cpu)
-{
- if (drv != __cpuidle_get_cpu_driver(cpu))
- return;
+#else
- if (!WARN_ON(drv->refcnt > 0))
- __cpuidle_set_cpu_driver(NULL, cpu);
+static struct cpuidle_driver *cpuidle_curr_driver;
- if (drv->bctimer) {
- drv->bctimer = 0;
- on_each_cpu_mask(get_cpu_mask(cpu), cpuidle_setup_broadcast_timer,
- (void *)CLOCK_EVT_NOTIFY_BROADCAST_OFF, 1);
- }
+/**
+ * __cpuidle_get_cpu_driver: returns the global cpuidle driver pointer.
+ *
+ * @cpu: an integer specifying the cpu number, this parameter is ignored
+ *
+ * Returns a pointer to a struct cpuidle_driver, NULL if no driver was
+ * previously registered
+ */
+static inline struct cpuidle_driver *__cpuidle_get_cpu_driver(int cpu)
+{
+ return cpuidle_curr_driver;
}
-#ifdef CONFIG_CPU_IDLE_MULTIPLE_DRIVERS
+/**
+ * __cpuidle_set_driver: assign the cpuidle driver pointer to the global cpuidle
+ * driver variable.
+ *
+ * @drv: a pointer to a struct cpuidle_driver
+ *
+ * Returns 0 on success, < 0 otherwise
+ */
+static inline int __cpuidle_set_driver(struct cpuidle_driver *drv)
+{
+ if (cpuidle_curr_driver)
+ return -EBUSY;
-static DEFINE_PER_CPU(struct cpuidle_driver *, cpuidle_drivers);
+ cpuidle_curr_driver = drv;
-static void __cpuidle_set_cpu_driver(struct cpuidle_driver *drv, int cpu)
-{
- per_cpu(cpuidle_drivers, cpu) = drv;
+ return 0;
}
-static struct cpuidle_driver *__cpuidle_get_cpu_driver(int cpu)
+/**
+ * __cpuidle_unset_driver: reset the global cpuidle driver variable if the
+ * cpuidle driver pointer matches it.
+ *
+ * @drv: a pointer to a struct cpuidle_driver
+ */
+static inline void __cpuidle_unset_driver(struct cpuidle_driver *drv)
{
- return per_cpu(cpuidle_drivers, cpu);
+ if (drv == cpuidle_curr_driver)
+ cpuidle_curr_driver = NULL;
}
-static void __cpuidle_unregister_all_cpu_driver(struct cpuidle_driver *drv)
+#endif
+
+/**
+ * cpuidle_setup_broadcast_timer: set the broadcast timer notification for the
+ * current cpu. This function is called in per-cpu context, invoked by an smp
+ * cross call. It is not supposed to be called directly.
+ *
+ * @arg: a void pointer, actually used to match the smp cross call api but used
+ * as a long with two values:
+ * - CLOCK_EVT_NOTIFY_BROADCAST_ON
+ * - CLOCK_EVT_NOTIFY_BROADCAST_OFF
+ */
+static void cpuidle_setup_broadcast_timer(void *arg)
{
- int cpu;
- for_each_present_cpu(cpu)
- __cpuidle_unregister_driver(drv, cpu);
+ int cpu = smp_processor_id();
+ clockevents_notify((long)(arg), &cpu);
}
-static int __cpuidle_register_all_cpu_driver(struct cpuidle_driver *drv)
+/**
+ * __cpuidle_driver_init: initialize the driver internal data.
+ *
+ * @drv: a valid pointer to a struct cpuidle_driver
+ *
+ * Returns 0 on success, < 0 otherwise
+ */
+static int __cpuidle_driver_init(struct cpuidle_driver *drv)
{
- int ret = 0;
- int i, cpu;
+ int i;
- for_each_present_cpu(cpu) {
- ret = __cpuidle_register_driver(drv, cpu);
- if (ret)
- break;
- }
+ drv->refcnt = 0;
- if (ret)
- for_each_present_cpu(i) {
- if (i == cpu)
- break;
- __cpuidle_unregister_driver(drv, i);
- }
+ /*
+ * we default here to all cpu possible because if the kernel
+ * boots with some cpus offline and then we online one of them
+ * the cpu notifier won't know which driver to assign
+ */
+ if (!drv->cpumask)
+ drv->cpumask = cpu_possible_mask;
+
+ /*
+ * we look for the timer stop flag in the different states,
+ * so we know whether we have to set up the broadcast timer. The loop is
+ * in reverse order, because usually the deeper state has this
+ * flag set
+ */
+ for (i = drv->state_count - 1; i >= 0 ; i--) {
+ if (!(drv->states[i].flags & CPUIDLE_FLAG_TIMER_STOP))
+ continue;
- return ret;
+ drv->bctimer = 1;
+ break;
+ }
+
+ return 0;
}
-int cpuidle_register_cpu_driver(struct cpuidle_driver *drv, int cpu)
+/**
+ * __cpuidle_register_driver: do some sanity checks, initializes the driver,
+ * assign the driver to the global cpuidle driver variable(s) and setup the
+ * broadcast timer if the cpuidle driver has some states which shutdown the
+ * local timer.
+ *
+ * @drv: a valid pointer to a struct cpuidle_driver
+ *
+ * Returns 0 on success, < 0 otherwise
+ */
+static int __cpuidle_register_driver(struct cpuidle_driver *drv)
{
int ret;
- spin_lock(&cpuidle_driver_lock);
- ret = __cpuidle_register_driver(drv, cpu);
- spin_unlock(&cpuidle_driver_lock);
+ if (!drv || !drv->state_count)
+ return -EINVAL;
- return ret;
-}
+ if (cpuidle_disabled())
+ return -ENODEV;
-void cpuidle_unregister_cpu_driver(struct cpuidle_driver *drv, int cpu)
-{
- spin_lock(&cpuidle_driver_lock);
- __cpuidle_unregister_driver(drv, cpu);
- spin_unlock(&cpuidle_driver_lock);
-}
+ ret = __cpuidle_driver_init(drv);
+ if (ret)
+ return ret;
-/**
- * cpuidle_register_driver - registers a driver
- * @drv: the driver
- */
-int cpuidle_register_driver(struct cpuidle_driver *drv)
-{
- int ret;
+ ret = __cpuidle_set_driver(drv);
+ if (ret)
+ return ret;
- spin_lock(&cpuidle_driver_lock);
- ret = __cpuidle_register_all_cpu_driver(drv);
- spin_unlock(&cpuidle_driver_lock);
+ if (drv->bctimer)
+ on_each_cpu_mask(drv->cpumask, cpuidle_setup_broadcast_timer,
+ (void *)CLOCK_EVT_NOTIFY_BROADCAST_ON, 1);
- return ret;
+ return 0;
}
-EXPORT_SYMBOL_GPL(cpuidle_register_driver);
/**
- * cpuidle_unregister_driver - unregisters a driver
- * @drv: the driver
+ * __cpuidle_unregister_driver: checks the driver is no longer in use, reset the
+ * global cpuidle driver variable(s) and disable the timer broadcast
+ * notification mechanism if it was in use.
+ *
+ * @drv: a valid pointer to a struct cpuidle_driver
+ *
+ * This function returns no value
*/
-void cpuidle_unregister_driver(struct cpuidle_driver *drv)
+static void __cpuidle_unregister_driver(struct cpuidle_driver *drv)
{
- spin_lock(&cpuidle_driver_lock);
- __cpuidle_unregister_all_cpu_driver(drv);
- spin_unlock(&cpuidle_driver_lock);
-}
-EXPORT_SYMBOL_GPL(cpuidle_unregister_driver);
-
-#else
-
-static struct cpuidle_driver *cpuidle_curr_driver;
+ if (WARN_ON(drv->refcnt > 0))
+ return;
-static inline void __cpuidle_set_cpu_driver(struct cpuidle_driver *drv, int cpu)
-{
- cpuidle_curr_driver = drv;
-}
+ __cpuidle_unset_driver(drv);
-static inline struct cpuidle_driver *__cpuidle_get_cpu_driver(int cpu)
-{
- return cpuidle_curr_driver;
+ if (drv->bctimer) {
+ drv->bctimer = 0;
+ on_each_cpu_mask(drv->cpumask, cpuidle_setup_broadcast_timer,
+ (void *)CLOCK_EVT_NOTIFY_BROADCAST_OFF, 1);
+ }
}
/**
- * cpuidle_register_driver - registers a driver
- * @drv: the driver
+ * cpuidle_register_driver: registers a driver by taking a lock to prevent
+ * multiple callers to [un]register a driver at the same time.
+ *
+ * @drv: a pointer to a valid struct cpuidle_driver
+ *
+ * Returns 0 on success, < 0 otherwise
*/
int cpuidle_register_driver(struct cpuidle_driver *drv)
{
- int ret, cpu;
+ int ret;
- cpu = get_cpu();
spin_lock(&cpuidle_driver_lock);
- ret = __cpuidle_register_driver(drv, cpu);
+ ret = __cpuidle_register_driver(drv);
spin_unlock(&cpuidle_driver_lock);
- put_cpu();
return ret;
}
EXPORT_SYMBOL_GPL(cpuidle_register_driver);
/**
- * cpuidle_unregister_driver - unregisters a driver
- * @drv: the driver
+ * cpuidle_unregister_driver: unregisters a driver by taking a lock to prevent
+ * multiple callers to [un]register a driver at the same time. The specified
+ * driver must match the driver currently registered.
+ *
+ * @drv: a pointer to a valid struct cpuidle_driver
*/
void cpuidle_unregister_driver(struct cpuidle_driver *drv)
{
- int cpu;
-
- cpu = get_cpu();
spin_lock(&cpuidle_driver_lock);
- __cpuidle_unregister_driver(drv, cpu);
+ __cpuidle_unregister_driver(drv);
spin_unlock(&cpuidle_driver_lock);
- put_cpu();
}
EXPORT_SYMBOL_GPL(cpuidle_unregister_driver);
-#endif
/**
- * cpuidle_get_driver - return the current driver
+ * cpuidle_get_driver: returns the driver tied with the current cpu.
+ *
+ * Returns a struct cpuidle_driver pointer, or NULL if no driver is registered
*/
struct cpuidle_driver *cpuidle_get_driver(void)
{
@@ -233,7 +294,12 @@ struct cpuidle_driver *cpuidle_get_driver(void)
EXPORT_SYMBOL_GPL(cpuidle_get_driver);
/**
- * cpuidle_get_cpu_driver - return the driver tied with a cpu
+ * cpuidle_get_cpu_driver: returns the driver registered with a cpu.
+ *
+ * @dev: a valid pointer to a struct cpuidle_device
+ *
+ * Returns a struct cpuidle_driver pointer, or NULL if no driver is registered
+ * for the specified cpu
*/
struct cpuidle_driver *cpuidle_get_cpu_driver(struct cpuidle_device *dev)
{
@@ -244,6 +310,13 @@ struct cpuidle_driver *cpuidle_get_cpu_driver(struct cpuidle_device *dev)
}
EXPORT_SYMBOL_GPL(cpuidle_get_cpu_driver);
+/**
+ * cpuidle_driver_ref: gets a refcount for the driver. Note this function takes
+ * a refcount for the driver assigned to the current cpu.
+ *
+ * Returns a struct cpuidle_driver pointer, or NULL if no driver is registered
+ * for the current cpu
+ */
struct cpuidle_driver *cpuidle_driver_ref(void)
{
struct cpuidle_driver *drv;
@@ -257,6 +330,10 @@ struct cpuidle_driver *cpuidle_driver_ref(void)
return drv;
}
+/**
+ * cpuidle_driver_unref: puts down the refcount for the driver. Note this
+ * function decrements the refcount for the driver assigned to the current cpu.
+ */
void cpuidle_driver_unref(void)
{
struct cpuidle_driver *drv = cpuidle_get_driver();
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 3c86faa..e7a94db 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -101,16 +101,20 @@ static inline int cpuidle_get_last_residency(struct cpuidle_device *dev)
****************************/
struct cpuidle_driver {
- const char *name;
- struct module *owner;
- int refcnt;
+ const char *name;
+ struct module *owner;
+ int refcnt;
/* used by the cpuidle framework to setup the broadcast timer */
- unsigned int bctimer:1;
+ unsigned int bctimer:1;
+
/* states array must be ordered in decreasing power consumption */
- struct cpuidle_state states[CPUIDLE_STATE_MAX];
- int state_count;
- int safe_state_index;
+ struct cpuidle_state states[CPUIDLE_STATE_MAX];
+ int state_count;
+ int safe_state_index;
+
+ /* the driver handles the cpus in cpumask */
+ const struct cpumask *cpumask;
};
#ifdef CONFIG_CPU_IDLE
@@ -135,9 +139,6 @@ extern void cpuidle_disable_device(struct cpuidle_device *dev);
extern int cpuidle_play_dead(void);
extern struct cpuidle_driver *cpuidle_get_cpu_driver(struct cpuidle_device *dev);
-extern int cpuidle_register_cpu_driver(struct cpuidle_driver *drv, int cpu);
-extern void cpuidle_unregister_cpu_driver(struct cpuidle_driver *drv, int cpu);
-
#else
static inline void disable_cpuidle(void) { }
static inline int cpuidle_idle_call(void) { return -ENODEV; }
--
1.7.9.5
Hi,
This patchset takes advantage of the new per-task load tracking that is
available in the kernel to pack small tasks onto as few CPUs/clusters/cores as
possible. The main goal of packing small tasks is to reduce power consumption
in low-load use cases by minimizing the number of power domains that are
enabled. The packing is done in 2 steps:
The 1st step looks for the best place to pack tasks in a system according to
its topology, and it defines a pack buddy CPU for each CPU if one is
available. We define the best CPU during the build of the sched_domain instead
of evaluating it at runtime because it can be difficult to define a stable
buddy CPU in a low CPU load situation. The policy for defining a buddy CPU is
that we pack at all levels inside a node where a group of CPUs can be power
gated independently from others. To describe this capability, a new flag,
SD_SHARE_POWERDOMAIN, has been introduced; it indicates whether the groups of
CPUs of a scheduling domain share their power state. By default, this flag is
set in all sched_domains in order to keep the current behavior of the
scheduler unchanged, and only the ARM platform clears the SD_SHARE_POWERDOMAIN
flag, for the MC and CPU levels.
In the 2nd step, the scheduler checks the load average of a task which wakes
up as well as the load average of the buddy CPU, and it can decide to migrate
the light task to a buddy that is not busy. This check is done during wake up
because small tasks tend to wake up between periodic load balances and
asynchronously to each other, which prevents the default mechanism from
catching and migrating them efficiently. A light task is defined by a
runnable_avg_sum that is less than 20% of the runnable_avg_period. In fact,
this condition implies two things: the average CPU load of the task must be
less than 20%, and the task must have been runnable for less than 10ms when it
last woke up, in order to be eligible for the packing migration. So, a task
that runs 1ms every 5ms will be considered a small task, but a task that runs
50ms with a period of 500ms will not.
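As a rough illustration of the light task test, the 20% check and the origin
of the 10ms figure can be sketched in plain C. The helper names are
hypothetical. The second function assumes the usual per-entity load tracking
model, in which time is accounted in periods of about 1ms and a contribution
is halved every 32 periods; that decay model and the hardcoded constant are
assumptions of this sketch, not something stated in this cover letter.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: a task is light when its runnable_avg_sum is below
 * 20% of its runnable_avg_period, i.e. sum * 5 < period.
 */
bool sketch_is_light_task(uint32_t runnable_avg_sum,
			  uint32_t runnable_avg_period)
{
	return (uint64_t)runnable_avg_sum * 5 < runnable_avg_period;
}

/*
 * Assumed decay model: after a long idle history, a task that runs for
 * `ms` consecutive periods reaches a sum/period ratio of 1 - y^ms, with
 * y^32 = 0.5. Return the first whole number of periods at which the ratio
 * is no longer below 20%.
 */
int sketch_light_runtime_limit_ms(void)
{
	const double y = 0.97857206;	/* approximately 0.5^(1/32) */
	double decay = 1.0;		/* y^ms, starting at ms = 0 */
	int ms = 0;

	while (1.0 - decay < 0.20) {
		decay *= y;
		ms++;
	}
	return ms;
}
```

Under these assumptions the ratio stays below 20% through 10 periods and
crosses it at the 11th, which is consistent with the "less than 10ms" bound
quoted above.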
Then, the busyness of the buddy CPU depends on the load average of the rq and
the number of running tasks. A CPU with a load average greater than 50% will
be considered busy whatever the number of running tasks is, and this threshold
is reduced as the number of running tasks grows in order not to increase the
wake up latency of a task too much. When the buddy CPU is busy, the scheduler
falls back to the default CFS policy.
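One possible way to express the busy test described above is sketched below.
The function name and the exact formula, a threshold of
period / (nr_running + 2), are assumptions chosen to match the description:
the threshold is 50% with no running task and shrinks as tasks are added. The
patches themselves may compute it differently.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: the buddy CPU is busy when the runnable sum of its
 * rq exceeds a threshold that starts at 50% of the period and is reduced
 * by each running task (assumed formula, see the text above).
 */
bool sketch_is_buddy_busy(uint32_t rq_runnable_avg_sum,
			  uint32_t rq_runnable_avg_period,
			  unsigned int nr_running)
{
	uint32_t threshold = rq_runnable_avg_period / (nr_running + 2);

	return rq_runnable_avg_sum > threshold;
}
```

With a period of 1000, an idle buddy is busy above a sum of 500, a buddy with
one running task above 333, and one with two running tasks above 250.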
Change since V2:
- Migrate only a task that wakes up
- Change the light tasks threshold to 20%
- Change the loaded CPU threshold to not pull tasks if the current number of
running tasks is zero but the load average is already greater than 50%
- Fix the algorithm for selecting the buddy CPU.
Change since V1:
Patch 2/6
- Change the flag name which was not clear. The new name is
SD_SHARE_POWERDOMAIN.
- Create an architecture dependent function to tune the sched_domain flags
Patch 3/6
- Fix issues in the algorithm that looks for the best buddy CPU
- Use pr_debug instead of pr_info
- Fix for uniprocessor
Patch 4/6
- Remove the use of usage_avg_sum which has not been merged
Patch 5/6
- Change the way the coherency of runnable_avg_sum and runnable_avg_period is
ensured
Patch 6/6
- Use the arch dependent function to set/clear SD_SHARE_POWERDOMAIN for ARM
platform
New results for v3:
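This series has been tested with hackbench on an ARM platform and the results
don't show any performance regression.
Hackbench            | 3.9-rc2 | +patches |
-------------------------------------------
Mean Time (10 tests) |   2.048 |    2.015 |
stdev                |   0.047 |    0.068 |
-------------------------------------------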
Previous results for V2:
This series has been tested with MP3 play back on ARM platform:
TC2 HMP (dual CA-15 and 3xCA-7 cluster).
The measurements have been done on an Ubuntu image during 60 seconds of
playback and the result has been normalized to 100.
        | CA15 | CA7 | total |
-------------------------------------
default |   81 |  97 |   178 |
pack    |   13 | 100 |   113 |
-------------------------------------
Previous results for V1:
The patch-set has been tested on ARM platforms: quad CA-9 SMP and TC2 HMP
(dual CA-15 and 3xCA-7 cluster). For ARM platform, the results have
demonstrated that it's worth packing small tasks at all topology levels.
The performance tests have been done on both platforms with sysbench. The
results don't show any performance regressions. These results are aligned with
the policy which uses the normal behavior with heavy use cases.
test: sysbench --test=cpu --num-threads=N --max-requests=R run
The results below are the average duration of 3 tests on the quad CA-9.
default is the current scheduler behavior (pack buddy CPU is -1)
pack is the scheduler with the pack mechanism
             | default |  pack   |
-----------------------------------
N=8;  R=200  |  3.1999 |  3.1921 |
N=8;  R=2000 | 31.4939 | 31.4844 |
N=12; R=200  |  3.2043 |  3.2084 |
N=12; R=2000 | 31.4897 | 31.4831 |
N=16; R=200  |  3.1774 |  3.1824 |
N=16; R=2000 | 31.4899 | 31.4897 |
-----------------------------------
The power consumption tests have been done only on TC2 platform which has got
accessible power lines and I have used cyclictest to simulate small tasks. The
tests show some power consumption improvements.
test: cyclictest -t 8 -q -e 1000000 -D 20 & cyclictest -t 8 -q -e 1000000 -D 20
The measurements have been done during 16 seconds and the result has been
normalized to 100
        | CA15 | CA7 | total |
-------------------------------------
default |  100 |  40 |   140 |
pack    |   <1 |  45 |   <46 |
-------------------------------------
The A15 cluster is less power efficient than the A7 cluster, but if we assume
that the tasks are well spread over both clusters, we can estimate that the
power consumption on a dual cluster of CA7 would have been, for a default
kernel:
        | CA7 | CA7 | total |
-------------------------------------
default |  40 |  40 |    80 |
-------------------------------------
Vincent Guittot (6):
Revert "sched: Introduce temporary FAIR_GROUP_SCHED dependency for
load-tracking"
sched: add a new SD_SHARE_POWERDOMAIN flag for sched_domain
sched: pack small tasks
sched: secure access to other CPU statistics
sched: pack the idle load balance
ARM: sched: clear SD_SHARE_POWERDOMAIN
arch/arm/kernel/topology.c | 9 +++
arch/ia64/include/asm/topology.h | 1 +
arch/tile/include/asm/topology.h | 1 +
include/linux/sched.h | 9 +--
include/linux/topology.h | 4 +
kernel/sched/core.c | 14 ++--
kernel/sched/fair.c | 149 +++++++++++++++++++++++++++++++++++---
kernel/sched/sched.h | 14 ++--
8 files changed, 169 insertions(+), 32 deletions(-)
--
1.7.9.5