For bytemaps each IRQ field is 1 byte wide, so we pack 4 IRQ fields in
one word, and since there are 32 private (per-CPU) IRQs, we have 8
private u32 fields in the vgic_bytemap struct. We shift the offset from
the base of the register group right by 2, giving us the word index
instead of the field index. The private/shared check must therefore
compare against 8 private words, not 4, which is also why we subtract 8
words from the offset of the shared words.
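The index arithmetic described above can be sketched in plain C as follows. This is only a userspace illustration, not the kernel code: the struct layout, the NR_CPUS and NR_IRQS values and the function name are assumptions made for the example; only the 32 private IRQs and the 4 fields per word come from the description.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sizes; only the 32 private IRQs come from the text. */
#define NR_CPUS          2
#define NR_IRQS          128
#define NR_PRIVATE_IRQS  32
#define FIELDS_PER_WORD  4	/* four 1-byte fields per u32 */
#define NR_PRIVATE_WORDS (NR_PRIVATE_IRQS / FIELDS_PER_WORD)	/* 8 */
#define NR_SHARED_WORDS  ((NR_IRQS - NR_PRIVATE_IRQS) / FIELDS_PER_WORD)

/* Hypothetical stand-in for the bytemap: per-CPU words, then shared words. */
struct bytemap {
	uint32_t percpu[NR_CPUS][NR_PRIVATE_WORDS];
	uint32_t shared[NR_SHARED_WORDS];
};

/* Map a byte offset into the register group to the word that backs it. */
uint32_t *bytemap_get_reg(struct bytemap *x, int cpuid, uint32_t offset)
{
	uint32_t word = offset >> 2;	/* byte offset -> word index */

	assert(word < NR_PRIVATE_WORDS + NR_SHARED_WORDS);
	if (word < NR_PRIVATE_WORDS)	/* 8 private words, not 4 */
		return &x->percpu[cpuid][word];
	return &x->shared[word - NR_PRIVATE_WORDS];
}
```

With a boundary of 4, byte offsets 16 to 31 would wrongly be resolved against the shared words, with a negative index after the subtraction of 8.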
Signed-off-by: Christoffer Dall <christoffer.dall(a)linaro.org>
---
virt/kvm/arm/vgic.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 17c5ac7..96d7aa4 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -149,7 +149,7 @@ static u32 *vgic_bytemap_get_reg(struct vgic_bytemap *x, int cpuid, u32 offset)
{
offset >>= 2;
BUG_ON(offset > (VGIC_NR_IRQS / 4));
- if (offset < 4)
+ if (offset < 8)
return x->percpu[cpuid] + offset;
else
return x->shared + offset - 8;
--
1.7.10.4
This patch set contains the userland changes necessary for out-of-the-box
support of persistent events. These patches follow on from the kernel
patches I sent out today:
[PATCH 00/16] perf, persistent: Kernel updates for perf tool integration
Persistent events are always-enabled kernel events. Buffers are mapped
read-only and multiple users are allowed. The persistent event flag of
the event attribute must be set to specify such an event.
The following changes to perf tools are necessary to support
persistent events. A way is needed to express event flags in sysfs
entries. For this, a new syntax 'attr<num>' was added to the event
parser, see patch #3. We also need to change perf tools to mmap
persistent event buffers read-only.
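A minimal sketch of the read-only fallback, assuming the kernel rejects a writable mapping of a persistent buffer with EACCES (as the title of patch 4 suggests). The function name and the 'writable' out-parameter are made up for this example and are not taken from the perf tools sources:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map an event buffer, preferring a writable mapping and falling back to
 * a read-only one if the first attempt fails with EACCES. On success,
 * *writable tells the caller which kind of mapping was obtained.
 * Returns MAP_FAILED on any other error.
 */
void *map_event_buffer(int fd, size_t len, int *writable)
{
	void *base;

	assert(writable != NULL);
	*writable = 1;
	base = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (base == MAP_FAILED && errno == EACCES) {
		/* Retry without PROT_WRITE for read-only buffers. */
		*writable = 0;
		base = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	}
	return base;
}
```

A caller that got a read-only mapping must not update the buffer's tail pointer, which is why the sketch reports the mapping mode back.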
All patches can be found here:
git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile.git persistent-v3
-Robert
Robert Richter (4):
perf tools: Rename flex conditions to avoid name conflicts
perf tools: Modify event parser to update event attribute by index
perf tools: Add attr<num> syntax to event parser
perf tools: Retry mapping buffers readonly on EACCES
tools/perf/builtin-record.c | 11 +++++---
tools/perf/builtin-top.c | 8 ++++--
tools/perf/perf.h | 1 +
tools/perf/tests/parse-events.c | 12 ++++++---
tools/perf/util/parse-events.c | 59 +++++++++++++++++++----------------------
tools/perf/util/parse-events.h | 12 ++++-----
tools/perf/util/parse-events.l | 56 +++++++++++++++++++++++---------------
tools/perf/util/parse-events.y | 24 ++++++++++-------
tools/perf/util/pmu.c | 32 +++++-----------------
tools/perf/util/pmu.h | 9 ++-----
tools/perf/util/pmu.l | 1 +
tools/perf/util/pmu.y | 18 ++++++++++---
12 files changed, 129 insertions(+), 114 deletions(-)
--
1.8.3.2
This patch set implements the necessary kernel changes for persistent
events.
Persistent events run standalone in the system without the need of a
controlling process that holds an event's file descriptor. The events
are always enabled and collect data samples in a ring buffer.
Processes may connect to existing persistent events using the
perf_event_open() syscall. For this the syscall must be configured
using the new PERF_TYPE_PERSISTENT event type and a unique event
identifier specified in attr.config. The id is propagated in sysfs or
using ioctl (see below).
Persistent event buffers may be accessed with mmap() in the same way
as for any other event. Since the buffers may be used by multiple
processes at the same time, there is only read-only access to them.
Currently there is only support for per-cpu events, thus root access
is needed too.
Persistent events are visible in sysfs. They are added or removed
dynamically. From the information in sysfs, userland knows how to
set up the perf_event attribute of a persistent event. Since a
persistent event always has the persistent flag set, a way is needed
to express this in sysfs. A new syntax is used for this. With
'attr<num>:<mask>' any bit in the attribute structure may be set in a
similar way as with 'config<num>', but <num> is an index that points
to the u64 value to change within the attribute.
For persistent events the persistent flag (bit 23 of flag field in
struct perf_event_attr) needs to be set which is expressed in sysfs
with "attr5:23". E.g. the mce_record event is described in sysfs as
follows:
/sys/bus/event_source/devices/persistent/events/mce_record:persistent,config=106
/sys/bus/event_source/devices/persistent/format/persistent:attr5:23
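To illustrate how a format entry such as "attr5:23" could be applied, here is a small userspace sketch. It treats the attribute as a plain array of u64 words and handles only a single bit position; both function names are hypothetical and this is not the parser code from the perf tools patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Parse "attr<num>:<bit>", e.g. "attr5:23". Returns 0 on success. */
int parse_attr_format(const char *s, unsigned int *word, unsigned int *bit)
{
	char trailing;

	assert(s && word && bit);
	/* Exactly two conversions: anything after <bit> is rejected. */
	if (sscanf(s, "attr%u:%u%c", word, bit, &trailing) != 2)
		return -1;
	return *bit < 64 ? 0 : -1;
}

/* Set one bit in the attribute, viewed as an array of u64 words. */
int attr_set_bit(uint64_t *attr_words, size_t nwords,
		 unsigned int word, unsigned int bit)
{
	if (word >= nwords || bit >= 64)
		return -1;
	attr_words[word] |= UINT64_C(1) << bit;
	return 0;
}
```

For "attr5:23" this sets bit 23 of the sixth u64 of the attribute, which is the flag word mentioned above.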
Note that perf tools need to support the 'attr<num>' syntax that is
added in a separate patch set. With it we are able to run perf tool
commands to read persistent events, e.g.:
# perf record -e persistent/mce_record/ sleep 10
# perf top -e persistent/mce_record/
In general, the new syntax is flexible enough to describe in sysfs any
event to be set up by perf tools.
There are ioctl functions to control persistent events that can be
used to detach an event from, or attach it to, a process. The
PERF_EVENT_IOC_DETACH ioctl call makes an event persistent. The
perf_event_open() syscall can be used to re-open the event by any
process. The PERF_EVENT_IOC_ATTACH ioctl attaches the event again so
that it is removed after closing the event's fd.
The patches are based on the original work by Borislav Petkov.
This version 3 of the patch set is a complete rework of the code.
There are the following major changes:
* new event type PERF_TYPE_PERSISTENT introduced,
* support for all types of events,
* unique event ids,
* improvements in reference counting and locking,
* ioctl functions are added to control persistency,
* the sysfs implementation now uses variable list size.
This should address most issues discussed during the last review of
version 2. The following is still unresolved and can be added later on
top of these patches, if necessary:
* support for per-task events (also allowing non-root access),
* creation of persistent events for disabled cpus,
* make event persistent with already open (mmap'ed) buffers,
* make event persistent while creating it.
First patches contain some rework of the perf mmap code to reuse it
for persistent events.
Also note that patch 12 (ioctl functions to control persistency) is
RFC and untested. A perf tools implementation for this is missing and
ideas are needed on how this could be integrated, esp. in something
like perf trace.
All patches can be found here:
git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile.git persistent-v3
Note: I will resend the perf tools patches necessary to use persistent
events.
-Robert
Borislav Petkov (1):
mce, x86: Enable persistent events
Robert Richter (11):
perf, mmap: Factor out ring_buffer_detach_all()
perf, mmap: Factor out try_get_event()/put_event()
perf, mmap: Factor out perf_alloc/free_rb()
perf, mmap: Factor out perf_get_fd()
perf: Add persistent events
perf, persistent: Implementing a persistent pmu
perf, persistent: Exposing persistent events using sysfs
perf, persistent: Use unique event ids
perf, persistent: Implement reference counter for events
perf, persistent: Dynamically resize list of sysfs entries
[RFC] perf, persistent: ioctl functions to control persistency
.../testing/sysfs-bus-event_source-devices-format | 43 +-
arch/x86/kernel/cpu/mcheck/mce.c | 19 +
include/linux/perf_event.h | 12 +-
include/uapi/linux/perf_event.h | 6 +-
kernel/events/Makefile | 2 +-
kernel/events/core.c | 210 +++++---
kernel/events/internal.h | 20 +
kernel/events/persistent.c | 563 +++++++++++++++++++++
8 files changed, 779 insertions(+), 96 deletions(-)
create mode 100644 kernel/events/persistent.c
--
1.8.3.2
This patchset does the following 3 things:
1) Fixes the RTC DT node name for Exynos5250 SoC
2) Updates the "status" property of RTC DT node for Exynos5250 SoC
3) Adds RTC DT node to Exynos5420 SoC
changes since v4:
- removed "status" property of RTC DT node from exynos5250 board dts files
changes since v3:
- split the 5250 related modifications into 2 separate patches as
suggested by Tomasz Figa <t.figa(a)samsung.com>
changes since v2:
- split the 5250 related modifications into a separate patch.
- placed the RTC node as per the alphabetical order in the DTS file as
suggested by Kukjin Kim <kgene(a)kernel.org>.
changes since v1:
- made DT node status as "okay" in the dtsi file itself.
Vikas Sajjan (3):
ARM: dts: Fix the RTC DT node name for Exynos5250 SoC
ARM: dts: Update the "status" property of RTC DT node for Exynos5250
SoC
ARM: dts: Add RTC DT node to Exynos5420 SoC
arch/arm/boot/dts/exynos5.dtsi | 2 +-
arch/arm/boot/dts/exynos5250-arndale.dts | 4 ----
arch/arm/boot/dts/exynos5250-snow.dts | 4 ----
arch/arm/boot/dts/exynos5250.dtsi | 3 ++-
arch/arm/boot/dts/exynos5420.dtsi | 6 ++++++
5 files changed, 9 insertions(+), 10 deletions(-)
--
1.7.9.5
Hi all,
This patch set introduces a buffer synchronization framework based
on DMA BUF[1], using ww-mutexes[2] as the locking mechanism. It
has been rebased on linux-3.11-rc6.
The purpose of this framework is to provide not only buffer access
control between CPU and CPU, CPU and DMA, and DMA and DMA, but also
easy-to-use interfaces for device drivers and user applications.
In addition, this patch set suggests a way to enhance performance.
Changelog v7:
Fix things pointed out by Konrad Rzeszutek Wilk,
- Use EXPORT_SYMBOL_GPL instead of EXPORT_SYMBOL.
- Make sure to unlock and unreference all dmabuf objects
when dmabuf_sync_fini() is called.
- Add more comments.
- Code cleanups.
Changelog v6:
- Fix sync lock to multiple reads.
- Add select system call support.
. Wake up poll_wait when a dmabuf is unlocked.
- Remove the unnecessary use of mutex lock.
- Add private backend ops callbacks.
. This ops has one callback for device drivers to clean up their
sync object resource when the sync object is freed. For this,
device drivers should implement the free callback properly.
- Update document file.
Changelog v5:
- Remove the dependence on reservation_object: the reservation_object is used
to hook up to ttm and dma-buf for easy sharing of reservations across
devices. However, the dmabuf sync can be used for all dma devices, v4l2
and drm based drivers alike, so it doesn't need the reservation_object anymore.
With regard to this, it adds 'void *sync' to the dma_buf structure.
- All patches are rebased on mainline, Linux v3.10.
Changelog v4:
- Add user side interface for buffer synchronization mechanism and update
descriptions related to the user side interface.
Changelog v3:
- remove cache operation relevant codes and update document file.
Changelog v2:
- use atomic_add_unless to avoid potential bug.
- add a macro for checking valid access type.
- code cleanup.
For the generic user mode interface, we have used the fcntl and select
system calls[3]. A user application sees a buffer object as a dma-buf
file descriptor. So an fcntl() call on the file descriptor locks
some buffer region managed by the dma-buf object, and a select() call
waits for the completion of CPU or DMA access to the dma-buf
without locking. For more detail, refer to dma-buf-sync.txt
in Documentation/.
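From the application side, the two calls could be wrapped roughly as below. This is a sketch using only the standard fcntl(2) record-locking and select(2) interfaces; the helper names are invented, and the choice of F_SETLKW and of the read fd set are assumptions, since the exact semantics on a dma-buf fd are defined by the patches and dma-buf-sync.txt.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Lock (F_RDLCK/F_WRLCK) or unlock (F_UNLCK) a byte range of the buffer. */
int buf_region_lock(int fd, short type, off_t start, off_t len)
{
	struct flock fl = {0};

	fl.l_type = type;
	fl.l_whence = SEEK_SET;
	fl.l_start = start;
	fl.l_len = len;
	/* F_SETLKW blocks until the lock can be granted. */
	return fcntl(fd, F_SETLKW, &fl);
}

/* Wait, without taking a lock, until the fd is reported ready. */
int buf_wait_ready(int fd)
{
	fd_set rfds;

	assert(fd >= 0 && fd < FD_SETSIZE);
	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	/* Assumption: readiness is signalled on the read set. */
	return select(fd + 1, &rfds, NULL, NULL, NULL);
}
```

A CPU writer would take a write lock with buf_region_lock(fd, F_WRLCK, 0, size) before touching the buffer and release it with F_UNLCK afterwards, while a consumer that only needs to know when access has finished would call buf_wait_ready(fd).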
There are some cases where we should use this buffer synchronization
framework. One is to enhance GPU rendering performance on the Tizen
platform, both for a 3d app in compositing mode, where the app draws
into an off-screen buffer, and for a Web app.
In the case of a 3d app in compositing mode, which is not a full screen
mode, the app calls glFlush instead of glFinish to submit 3d commands to
the GPU driver, for better performance. The reason we call glFlush is that
glFinish blocks the caller's task until the execution of the 2d commands is
completed, which leaves the GPU and CPU more idle. As a result, 3d rendering
performance with glFinish is considerably lower than with glFlush. However,
glFlush has one issue: a buffer shared with the GPU could be corrupted when
the CPU accesses the buffer immediately after glFlush, because the CPU cannot
know when GPU access to the buffer has completed. Of course, the app can
learn that using eglWaitGL, but this function is valid only within the
same process.
The below summarizes how app's window is displayed on Tizen platform:
1. X client requests a window buffer to Xorg.
2. X client draws something in the window buffer using CPU.
3. X client requests SWAP to Xorg.
4. Xorg notifies a damage event to Composite Manager.
5. Composite Manager gets the window buffer (front buffer) through
DRI2GetBuffers.
6. Composite Manager composes the window buffer and its own back buffer
using GPU. At this time, eglSwapBuffers is called: internally, 3d
commands are flushed to gpu driver.
7. Composite Manager requests SWAP to Xorg.
8. Xorg performs drm page flip. At this time, the window buffer is
displayed on screen.
A Web app based on HTML5 has the same issue. The web browser and its web app
are different processes. The web app draws something in its own pixmap buffer,
then the web browser gets a window buffer from Xorg and composites
the pixmap buffer with the window buffer, followed by a page flip.
Thus, in such cases, a shared buffer could be corrupted when one process draws
into the pixmap buffer using the CPU while another process composites the
pixmap buffer with the window buffer using the GPU, without any locking mechanism.
That is why we need a userland locking interface, the fcntl system call.
The last case is a deferred page flip issue: a rendered window
buffer can take about 32ms to be displayed on screen in the worst case,
assuming that the gpu rendering is completed within 16ms.
That can happen when compositing a pixmap buffer with a window buffer
using the GPU just after a vsync has started. At this time, Xorg waits for
a vblank event to get a window buffer, so 3d rendering will be delayed
by up to about 16ms. As a result, the window buffer would be displayed after
about two vsyncs (about 32ms), which in turn shows up as slow
responsiveness.
For this, we could enhance the responsiveness with a locking
mechanism by skipping one vblank wait. I guess that, for a similar reason,
Android, Chrome OS, and other platforms use their own locking
mechanisms: Android sync driver, KDS, and DMA fence.
The diagram below shows the deferred page flip issue in the worst case:
     |------------ <- vsync signal
     |<------ DRI2GetBuffers
     |
     |
     |
     |------------ <- vsync signal
     |<------ Request gpu rendering
time |
     |
     |<------ Request page flip (deferred)
     |------------ <- vsync signal
     |<------ Displayed on screen
     |
     |
     |
     |------------ <- vsync signal
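The worst-case arithmetic can be checked with a small model. This is only an illustration of the timeline above: it assumes a fixed vsync period, that a frame becomes visible at the first vsync at or after rendering finishes, and that the deferred case holds the rendering request until the next vsync. The function names are made up for the example.

```c
#include <assert.h>

/* Round a timestamp up to the next vsync boundary (times in microseconds). */
long next_vsync(long t_us, long period_us)
{
	assert(period_us > 0 && t_us >= 0);
	return (t_us + period_us - 1) / period_us * period_us;
}

/*
 * Time at which a frame reaches the screen. If wait_vblank is non-zero,
 * the rendering request is held until the next vsync (the deferred case).
 */
long display_time(long request_us, long render_us, long period_us,
		  int wait_vblank)
{
	long start = wait_vblank ? next_vsync(request_us, period_us)
				 : request_us;

	return next_vsync(start + render_us, period_us);
}
```

With a 16000us period, a request arriving 1us after a vsync and 15000us of rendering, display_time() gives 32000us when the vblank wait is taken and 16000us when it is skipped, matching the two-vsync versus one-vsync figures above.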
Thanks,
Inki Dae
References:
[1] http://lwn.net/Articles/470339/
[2] https://patchwork.kernel.org/patch/2625361/
[3] http://linux.die.net/man/2/fcntl
Inki Dae (2):
dmabuf-sync: Add a buffer synchronization framework
dma-buf: Add user interfaces for dmabuf sync support
Documentation/dma-buf-sync.txt | 286 ++++++++++++++++
drivers/base/Kconfig | 7 +
drivers/base/Makefile | 1 +
drivers/base/dma-buf.c | 85 +++++
drivers/base/dmabuf-sync.c | 706 ++++++++++++++++++++++++++++++++++++++++
include/linux/dma-buf.h | 16 +
include/linux/dmabuf-sync.h | 236 ++++++++++++++
7 files changed, 1337 insertions(+), 0 deletions(-)
create mode 100644 Documentation/dma-buf-sync.txt
create mode 100644 drivers/base/dmabuf-sync.c
create mode 100644 include/linux/dmabuf-sync.h
--
1.7.5.4
Many CPUFreq drivers for SMP systems (where all cores share the same clock
lines) do similar things in their ->init() routine.
This patch creates a generic routine in the cpufreq core which these drivers
can use, so that we can remove some redundant code. The later part of the
patchset makes other drivers use this infrastructure.
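The kind of work such a shared init routine factors out can be modelled in userspace as below. This is a simplified sketch, not the kernel's cpufreq_generic_init(): the structs, the TABLE_END marker and the function signature are stand-ins invented for the example, and the assumption is that the common steps are validating the frequency table, recording the transition latency and marking all CPUs as sharing the policy.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_END (~0u)	/* hypothetical end-of-table marker */

struct freq_entry {
	unsigned int frequency;	/* kHz */
};

/* Hypothetical, much reduced stand-in for a cpufreq policy. */
struct policy {
	unsigned long cpus;			/* bitmask of CPUs on this clock */
	unsigned int min, max;			/* kHz */
	unsigned int transition_latency;	/* ns */
};

/* Common ->init() work for drivers whose CPUs all share one clock line. */
int generic_init(struct policy *p, const struct freq_entry *table,
		 unsigned int latency_ns, unsigned int nr_cpus)
{
	unsigned int min = ~0u, max = 0;
	size_t i;

	assert(p && table);
	if (nr_cpus == 0 || nr_cpus > 8 * sizeof(p->cpus))
		return -1;
	for (i = 0; table[i].frequency != TABLE_END; i++) {
		if (table[i].frequency < min)
			min = table[i].frequency;
		if (table[i].frequency > max)
			max = table[i].frequency;
	}
	if (max == 0)
		return -1;	/* empty or all-zero table */

	p->min = min;
	p->max = max;
	p->transition_latency = latency_ns;
	/* All CPUs share the clock, so they all belong to this policy. */
	p->cpus = nr_cpus == 8 * sizeof(p->cpus) ? ~0UL : (1UL << nr_cpus) - 1;
	return 0;
}
```

A driver's ->init() would then shrink to acquiring its clocks and making one call to the shared helper.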
Many drivers which weren't setting policy->cpus haven't been updated, as they
might have separate clocks for CPUs and setting all CPUs in policy->cpus may
corrupt them.
This is the sixth part of my cleanup work for CPUFreq; the first five are
(and obviously it's rebased over them):
1: cpufreq: Introduce cpufreq_table_validate_and_show()
https://lkml.org/lkml/2013/8/8/263
2: cpufreq: define generic routines for cpufreq drivers
https://lkml.org/lkml/2013/8/10/48
3. CPUFreq: Implement light weight ->target(): for 3.13
https://lkml.org/lkml/2013/8/13/349
4. CPUFreq: set policy->cur in cpufreq core instead of drivers
https://lkml.org/lkml/2013/8/14/288
5. CPUFreq: Move freq change notifications out of drivers
https://lkml.org/lkml/2013/8/15/506
All these are pushed here:
https://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/…
Viresh Kumar (14):
cpufreq: create cpufreq_generic_init() routine
cpufreq: cpu0: use cpufreq_generic_init() routine
cpufreq: dbx500: use cpufreq_generic_init() routine
cpufreq: exynos: use cpufreq_generic_init() routine
cpufreq: imx6q: use cpufreq_generic_init() routine
cpufreq: kirkwood: use cpufreq_generic_init() routine
cpufreq: maple: use cpufreq_generic_init() routine
cpufreq: pasemi: use cpufreq_generic_init() routine
cpufreq: pmac64: use cpufreq_generic_init() routine
cpufreq: s3c: use cpufreq_generic_init() routine
cpufreq: s5pv210: use cpufreq_generic_init() routine
cpufreq: sa11x0: use cpufreq_generic_init() routine
cpufreq: spear: use cpufreq_generic_init() routine
cpufreq: tegra: use cpufreq_generic_init() routine
drivers/cpufreq/cpufreq-cpu0.c | 19 +------------------
drivers/cpufreq/cpufreq.c | 31 +++++++++++++++++++++++++++++++
drivers/cpufreq/dbx500-cpufreq.c | 21 +--------------------
drivers/cpufreq/exynos-cpufreq.c | 7 +------
drivers/cpufreq/exynos5440-cpufreq.c | 14 ++------------
drivers/cpufreq/imx6q-cpufreq.c | 13 +------------
drivers/cpufreq/kirkwood-cpufreq.c | 5 +----
drivers/cpufreq/maple-cpufreq.c | 9 +--------
drivers/cpufreq/pasemi-cpufreq.c | 9 +--------
drivers/cpufreq/pmac64-cpufreq.c | 9 +--------
drivers/cpufreq/s3c2416-cpufreq.c | 6 ++----
drivers/cpufreq/s3c24xx-cpufreq.c | 13 +------------
drivers/cpufreq/s3c64xx-cpufreq.c | 5 ++---
drivers/cpufreq/s5pv210-cpufreq.c | 4 +---
drivers/cpufreq/sa1100-cpufreq.c | 6 +-----
drivers/cpufreq/sa1110-cpufreq.c | 6 +-----
drivers/cpufreq/spear-cpufreq.c | 14 ++------------
drivers/cpufreq/tegra-cpufreq.c | 14 +++++++++-----
include/linux/cpufreq.h | 3 +++
19 files changed, 63 insertions(+), 145 deletions(-)
--
1.7.12.rc2.18.g61b472e
vexpress_defconfig is known to be broken for some use cases. This series of
patches allows boot testing Versatile Express using a regular filesystem
on an SD card.
Fathi Boudra (3):
ARM: vexpress_defconfig: Enable and automount devtmpfs filesystem
ARM: vexpress_defconfig: Enable voltage regulator support
ARM: vexpress_defconfig: Enable ext4 filesystem
arch/arm/configs/vexpress_defconfig | 5 +++++
1 file changed, 5 insertions(+)
--
1.8.1.2
Tegra's cpufreq driver was maintaining requested target frequencies in an array,
target_cpu_speed, and then finally setting the highest requested freq in the
core. This was probably done because both cores share a clock line and logically
we want to set both cores to the max frequency requested.
But this doesn't need to be done in individual CPUFreq drivers; it's already
taken care of by the CPUFreq governors. They evaluate load for all CPUs and finally
call target only for the frequency corresponding to the max load.
So, get rid of this stuff from Tegra's cpufreq driver.
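As a rough model of why the per-driver maximum is redundant, the sketch below picks a frequency from the highest load among the CPUs sharing the clock. It is not governor code: the function names and the simple "percentage of the top frequency" rule are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Highest load (0..100 percent) among the CPUs that share the clock. */
unsigned int max_load(const unsigned int *load, size_t ncpus)
{
	unsigned int max = 0;
	size_t i;

	for (i = 0; i < ncpus; i++)
		if (load[i] > max)
			max = load[i];
	return max;
}

/*
 * Index of the lowest frequency that covers the highest load, taken as a
 * percentage of the top frequency. freq_khz must be sorted ascending.
 */
size_t pick_index(const unsigned int *freq_khz, size_t nfreq,
		  const unsigned int *load, size_t ncpus)
{
	unsigned int want;
	size_t i;

	assert(nfreq > 0);
	want = freq_khz[nfreq - 1] / 100 * max_load(load, ncpus);
	for (i = 0; i < nfreq - 1; i++)
		if (freq_khz[i] >= want)
			break;
	return i;
}
```

Because the single target request already reflects the busiest CPU, the driver gets the same answer regardless of which CPU the request is issued for, so tracking per-CPU targets in the driver adds nothing.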
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
Hi Stephen,
It's only build tested and depends on lots of stuff that I have already sent for
the cpufreq core and its drivers. All of that is pushed here:
https://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/…
And only Tegra+cpufreq-core patches are pushed here (only 13 patches):
https://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/…
You can probably try cpufreq-next-tegra branch for testing on some real
hardware.
drivers/cpufreq/tegra-cpufreq.c | 35 ++++++-----------------------------
1 file changed, 6 insertions(+), 29 deletions(-)
diff --git a/drivers/cpufreq/tegra-cpufreq.c b/drivers/cpufreq/tegra-cpufreq.c
index b376b67..3f25ab6 100644
--- a/drivers/cpufreq/tegra-cpufreq.c
+++ b/drivers/cpufreq/tegra-cpufreq.c
@@ -47,7 +47,6 @@ static struct clk *pll_x_clk;
static struct clk *pll_p_clk;
static struct clk *emc_clk;
-static unsigned long target_cpu_speed[NUM_CPUS];
static DEFINE_MUTEX(tegra_cpu_lock);
static bool is_suspended;
@@ -103,9 +102,6 @@ static int tegra_update_cpu_speed(struct cpufreq_policy *policy,
{
int ret = 0;
- if (tegra_getspeed(0) == rate)
- return ret;
-
/*
* Vote on memory bus frequency based on cpu frequency
* This sets the minimum frequency, display or avp may request higher
@@ -125,35 +121,16 @@ static int tegra_update_cpu_speed(struct cpufreq_policy *policy,
return ret;
}
-static unsigned long tegra_cpu_highest_speed(void)
-{
- unsigned long rate = 0;
- int i;
-
- for_each_online_cpu(i)
- rate = max(rate, target_cpu_speed[i]);
- return rate;
-}
-
static int tegra_target(struct cpufreq_policy *policy, unsigned int index)
{
- unsigned int freq;
- int ret = 0;
+ int ret = -EBUSY;
mutex_lock(&tegra_cpu_lock);
- if (is_suspended) {
- ret = -EBUSY;
- goto out;
- }
-
- freq = freq_table[index].frequency;
+ if (!is_suspended)
+ ret = tegra_update_cpu_speed(policy,
+ freq_table[index].frequency);
- target_cpu_speed[policy->cpu] = freq;
-
- ret = tegra_update_cpu_speed(policy, tegra_cpu_highest_speed());
-
-out:
mutex_unlock(&tegra_cpu_lock);
return ret;
}
@@ -167,7 +144,8 @@ static int tegra_pm_notify(struct notifier_block *nb, unsigned long event,
is_suspended = true;
pr_info("Tegra cpufreq suspend: setting frequency to %d kHz\n",
freq_table[0].frequency);
- tegra_update_cpu_speed(policy, freq_table[0].frequency);
+ if (tegra_getspeed(0) != freq_table[0].frequency)
+ tegra_update_cpu_speed(policy, freq_table[0].frequency);
cpufreq_cpu_put(policy);
} else if (event == PM_POST_SUSPEND) {
is_suspended = false;
@@ -190,7 +168,6 @@ static int tegra_cpu_init(struct cpufreq_policy *policy)
clk_prepare_enable(cpu_clk);
cpufreq_table_validate_and_show(policy, freq_table);
- target_cpu_speed[policy->cpu] = tegra_getspeed(policy->cpu);
/* FIXME: what's the actual transition time? */
policy->cpuinfo.transition_latency = 300 * 1000;
--
1.7.12.rc2.18.g61b472e