This patch series adds support for user space save/restore of the VGIC
state. Instead of expanding the ONE_REG interface, which works on
VCPUs, we first introduce support for the new KVM device control API and
the VGIC. Now, instead of calling KVM_CREATE_IRQCHIP, user space can
call KVM_CREATE_DEVICE and perform operations on the device fd, such as
KVM_SET_DEVICE_ATTR to set a device attribute.
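For illustration only, the user space side of this sequence could be sketched as below. This is a minimal sketch, not code from the series: the helper names vgic_create_device() and vgic_set_attr() are made up, and the device type, group and attribute values are left as parameters because the constants this series defines are not shown here. Only the generic ioctls and structs from <linux/kvm.h> are assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: create an in-kernel device on a VM fd.
 * Returns the new device fd, or -1 on error (errno is set by ioctl).
 */
static int vgic_create_device(int vm_fd, uint32_t type)
{
	struct kvm_create_device cd;

	memset(&cd, 0, sizeof(cd));
	cd.type = type;		/* device type constant, supplied by caller */

	if (ioctl(vm_fd, KVM_CREATE_DEVICE, &cd) < 0)
		return -1;

	return (int)cd.fd;	/* filled in by the kernel on success */
}

/*
 * Hypothetical helper: set one attribute on a device fd.
 * 'val' points to the user space buffer holding the value to write.
 * Returns 0 on success, or -1 on error (errno is set by ioctl).
 */
static int vgic_set_attr(int dev_fd, uint32_t group, uint64_t attr,
			 const void *val)
{
	struct kvm_device_attr da;

	assert(val != NULL);

	memset(&da, 0, sizeof(da));
	da.group = group;	/* attribute group, supplied by caller */
	da.attr = attr;		/* attribute within the group */
	da.addr = (uint64_t)(uintptr_t)val;

	return ioctl(dev_fd, KVM_SET_DEVICE_ATTR, &da);
}
```

A caller would first obtain the device fd with vgic_create_device() and then pass that fd, not the VM fd, to vgic_set_attr().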
We leverage the KVM_{SET/GET}_DEVICE_ATTR API to export the state of the
VGIC to user space. Instead of coming up with our own custom format for
exporting the VGIC state, we simply export all the state visible to an
emulated guest, which must contain the full GIC state to provide
save/restore of the GIC state for power management purposes. This
further provides the benefit of being able to re-use the MMIO emulation
code for the distributor for save/restore.
However, the need to save/restore cpu-specific state demands that user
space can save/restore state accessible through the CPU interface, and
we therefore add an emulation interface for the CPU-specific interface.
This is considered a first attempt, and I am not married to the device
control API. If there are good technical arguments to take another
approach, I am of course willing to discuss this. However, my attempts
with the ONE_REG interface did not look very nice.
[ WARNING: The patch set core functionality is completely untested;
the basic KVM system has been briefly tested on TC2 and it doesn't
seem like I've broken existing functionality. ]
I wanted to get this out early to get feedback on the overall API and
idea, and I'm writing some QEMU code for the user space side to test
the new functionality in the meantime.
Patches are against kvm-arm-next and also available here:
git://git.linaro.org/people/cdall/linux-kvm-arm.git vgic-migrate
Christoffer Dall (7):
KVM: arm-vgic: Support KVM_CREATE_DEVICE for VGIC
KVM: arm-vgic: Set base addr through device API
irqchip: arm-gic: Define additional MMIO offsets and masks
KVM: arm-vgic: Make vgic mmio functions more generic
KVM: arm-vgic: Add vgic reg access from dev attr
KVM: arm-vgic: Add GICD_SPENDSGIR and GICD_CPENDSGIR handlers
KVM: arm-vgic: Support CPU interface reg access
Documentation/virtual/kvm/api.txt | 5 +-
Documentation/virtual/kvm/devices/arm-vgic.txt | 52 +++
arch/arm/include/uapi/asm/kvm.h | 8 +
arch/arm/kvm/arm.c | 3 +-
include/kvm/arm_vgic.h | 2 +-
include/linux/irqchip/arm-gic.h | 14 +
include/linux/kvm_host.h | 1 +
include/uapi/linux/kvm.h | 1 +
virt/kvm/arm/vgic.c | 452 +++++++++++++++++++++++-
virt/kvm/kvm_main.c | 4 +
10 files changed, 522 insertions(+), 20 deletions(-)
create mode 100644 Documentation/virtual/kvm/devices/arm-vgic.txt
--
1.7.9.5
Hi Todd and others,
If we have a multi-package system, where we have multiple instances of struct
policy (one per package), we currently can't have multiple instances of the same
governor, i.e. we can't have multiple instances of the Interactive governor for
multiple packages.
This is a bottleneck for multicluster systems, where we want different packages
to use the Interactive governor, but with different tunables.
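The limitation can be pictured with a small user space model. This is only a sketch under assumptions: struct policy_model, struct gov_tunables, the two tunable fields and get_tunables() are invented for illustration and are not the kernel's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tunables; a real governor has more of them. */
struct gov_tunables {
	unsigned int hispeed_freq;
	unsigned int go_hispeed_load;
};

/* Simplified stand-in for a cpufreq policy (one per package). */
struct policy_model {
	int package_id;
	struct gov_tunables *governor_data;	/* set only in per-policy mode */
};

/* Single shared instance: all packages see the same tunables. */
static struct gov_tunables global_tunables = { 1000000, 85 };

/*
 * Return the tunables that apply to a policy. With per_policy set,
 * each policy carries its own instance; otherwise every policy
 * falls back to the one global instance.
 */
static struct gov_tunables *get_tunables(struct policy_model *p,
					 bool per_policy)
{
	assert(p != NULL);

	if (per_policy && p->governor_data)
		return p->governor_data;

	return &global_tunables;
}
```

With per_policy false, two packages always resolve to the same struct, which is the bottleneck described above; with it true, each package can be tuned independently.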
---------x------------x---------
I recently upstreamed this support in 3.10-rc1 for the cpufreq core and the
Ondemand and Conservative governors. This is an attempt to do the same for the
Interactive governor.
I didn't have any clue which kernel to rebase my patches on, as I couldn't
find a 3.10-rc based branch in your tree, so I based them on
experimental/android-3.9.
So, this is what this patchset does:
- Backports some important patches from v3.10-rc1/2 to v3.9: First 8 patches
- Adds a few more supporting patches which might go into rc3: Next 4 patches
- Finally, updates the Interactive governor: Last 4 patches
So, review is probably required only for the last 4 patches. The last patch is a
bit long, but it is mostly a rearrangement of the code rather than a major
update. It is based on the patchset which I wrote for the Ondemand/Conservative
governors.
This has been tested on an ARM big.LITTLE platform which has multiple packages
requiring separate tunables.
Nathan Zimmer (1):
cpufreq: Convert the cpufreq_driver_lock to a rwlock
Stratos Karafotis (1):
cpufreq: governors: Calculate iowait time only when necessary
Viresh Kumar (14):
cpufreq: Add per policy governor-init/exit infrastructure
cpufreq: governor: Implement per policy instances of governors
cpufreq: Call __cpufreq_governor() with correct policy->cpus mask
cpufreq: Don't call __cpufreq_governor() for drivers without target()
cpufreq: governors: Fix CPUFREQ_GOV_POLICY_{INIT|EXIT} notifiers
cpufreq: Issue CPUFREQ_GOV_POLICY_EXIT notifier before dropping
policy refcount
cpufreq: Add EXPORT_SYMBOL_GPL for have_governor_per_policy
cpufreq: governors: Move get_governor_parent_kobj() to cpufreq.c
cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT
cpufreq: Move get_cpu_idle_time() to cpufreq.c
cpufreq: interactive: Use generic get_cpu_idle_time() from cpufreq.c
cpufreq: interactive: Remove unnecessary cpu_online() check
cpufreq: interactive: Move definition of cpufreq_gov_interactive
downwards
cpufreq: Interactive: Implement per policy instances of governor
drivers/cpufreq/cpufreq.c | 157 ++++++--
drivers/cpufreq/cpufreq_conservative.c | 195 ++++++----
drivers/cpufreq/cpufreq_governor.c | 273 +++++++-------
drivers/cpufreq/cpufreq_governor.h | 120 +++++-
drivers/cpufreq/cpufreq_interactive.c | 663 +++++++++++++++++++--------------
drivers/cpufreq/cpufreq_ondemand.c | 274 ++++++++------
include/linux/cpufreq.h | 19 +-
7 files changed, 1043 insertions(+), 658 deletions(-)
--
1.7.12.rc2.18.g61b472e
Hi Peter/Ingo,
This set contains a few more minor fixes that I could find for the code
responsible for creating sched domains. They are rebased on top of my earlier
fixes:
Part 1:
https://lkml.org/lkml/2013/6/4/253
Part 2:
https://lkml.org/lkml/2013/6/10/141
They should be applied in this order to avoid conflicts.
My study of "How scheduling domains are created" is almost over now, so
this is probably my last patchset of fixes related to scheduling domains.
Sorry for the three separate sets; I sent them as soon as I had a few of them
sitting in my tree.
Viresh Kumar (3):
sched: Use cached value of span instead of calling
sched_domain_span()
sched: don't call get_group() for covered cpus
sched: remove WARN_ON(!sd) from init_sched_groups_power()
kernel/sched/core.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
--
1.7.12.rc2.18.g61b472e
Good day Jon,
Please include the attached patch in your tree. It is a fix for [1].
Thanks,
Mathieu.
[1]. https://bugs.launchpad.net/linaro-big-little-system/+bug/1097213
-------- Original Message --------
Subject: Re: Update on LP1097213
Date: Mon, 17 Jun 2013 16:31:47 +0100
From: Morten Rasmussen <morten.rasmussen(a)arm.com>
To: Mathieu Poirier <mathieu.poirier(a)linaro.org>
CC: Vincent Guittot <vincent.guittot(a)linaro.org>, Serge Broslavsky
<serge.broslavsky(a)linaro.org>, Amit Kucheria <amit.kucheria(a)linaro.org>,
Nicolas Pitre <nicolas.pitre(a)linaro.org>, Naresh Kamboju
<naresh.kamboju(a)linaro.org>
Hi Mathieu,
I had a quick look at the hmp_next_{up,down}_delay() stuff. It is all
introduced in the patch: "sched: SCHED_HMP multi-domain task migration
control". Reverting it requires some manual conflict fixing and you will
also need to remove the extra hmp_next_down_delay() added by a later patch.
I've attached a revert patch for debugging purposes that should do it all.
I'm not sure if this will just remove the symptom or if the sched_clock
accesses are the true cause of the problem.
I hope it helps,
Morten
On 17/06/13 14:26, Vincent Guittot wrote:
> Mathieu,
>
> Please find below the mail we have discussed during the call
>
> Vincent
>
> On 14 June 2013 15:21, Vincent Guittot <vincent.guittot(a)linaro.org> wrote:
>> On 14 June 2013 15:14, Vincent Guittot <vincent.guittot(a)linaro.org> wrote:
>>> On 14 June 2013 14:39, Mathieu Poirier <mathieu.poirier(a)linaro.org> wrote:
>>>> Anything on this ?!? Morten, Vincent ?
>>>
>>> Hi Mathieu,
>>>
>>> I hadn't noticed that the problem can be reproduced on a snowball
>>> the first time I read your email.
>>> Does it mean that the hmp specific functions are also called on smp systems?
>>>
>>> I'm going to look more deeply in the code
>>>
>>
>> for_each_online_cpu is used in hmp_force_up_migration but it's not
>> protected against hotplug, so it can use a cpu that is going to be
>> unplugged
>>
>> We should probably protect the sequence with get/put_online_cpus
>>
>> Vincent
>>
>>> Vincent
>>>
>>>>
>>>> On 13-06-12 03:13 PM, Mathieu Poirier wrote:
>>>>> Good day gents,
>>>>>
>>>>> I have been working on [1] for a while now, on and off as time
>>>>> permitted. The problem has always been very elusive but definitely
>>>>> present. As some of the notes in the bug report indicate TC2 wasn't the
>>>>> only ARM system I could reproduce this on - snowball suffered from the
>>>>> exact same problem.
>>>>>
>>>>> I started looking at this again for 3.10 and I have good and bad news.
>>>>>
>>>>> The good news is that I can't reproduce the problem anymore if
>>>>> CONFIG_SCHED_HMP is not enabled. I ran the attached script for more
>>>>> than 16 hours without even the hint of a problem. Normally one would
>>>>> get a crash [2] in less than a minute. I won't go so far as claiming
>>>>> that upstream solved the problem. Maybe we are lucky and timing in 3.10
>>>>> simply doesn't allow for the fault to occur. In any case, all we can do
>>>>> is continue monitoring the situation in upcoming versions.
>>>>>
>>>>> On the flip side we have a definite problem with hotplug when
>>>>> CONFIG_SCHED_HMP is defined. The crash in [2] is consistent and can be
>>>>> reproduced at will. Looking at the trace the problem happens in
>>>>> 'select_task_rq_fair' where calls to 'hmp_next_up_delay' and
>>>>> 'hmp_next_down_delay' end up referencing 'cfs_rq_clock_task' where
>>>>> cfs_rq->rq points to a bogus address.
>>>>>
>>>>> Have a look at line 9 in [2] - this is a little bit of instrumentation I
>>>>> started working on. It basically outputs the new and previous CPUs in
>>>>> 'hmp_[up,down]_migration' conditional statements along with the
>>>>> direction of the migration [3]. In every instance the system was going
>>>>> from the A15 to the A7 cluster. I haven't found a single instance where
>>>>> the opposite was true.
>>>>>
>>>>> Since this is directly related to our efforts to make the scheduler
>>>>> power aware and based on Ingo's latest rebuttal, I am not sure that it
>>>>> is wise for me to continue working on this - specifically if we end up
>>>>> scrapping that portion of the code. I'm eager to hear your opinion.
>>>>>
>>>>> On the flip side it highlights (once again) that we need to invest
>>>>> massively in the hotplug subsystem, more specifically in its relation to
>>>>> the scheduler and the RCU subsystem.
>>>>>
>>>>> Mathieu.
>>>>>
>>>>> PS. I have purposely kept the audience to a minimum - forward as you
>>>>> see fit.
>>>>>
>>>>> [1]. https://bugs.launchpad.net/linaro-big-little-system/+bug/1188778
>>>>> [2]. https://pastebin.linaro.org/view/0751c84b
>>>>> [3]. https://pastebin.linaro.org/view/4491ee27
>>>>>
>>>>
>