This patch series implements get_user_pages_fast on ARM. Unlike other
architectures, we do not use IPIs/disabled IRQs as a blocking
mechanism to protect the page table walker. Instead an atomic counter
is used to indicate how many fast gup walkers are active on an address
space, and any code that would cause them problems (THP splitting or
code that could free a page table page) spins on positive values of
this counter.
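To make the counter scheme above concrete, here is a minimal userspace
sketch in C11. It is not the code from the patches: struct model_mm and
all function names are hypothetical. It only models the counting and the
spinning, and leaves out the ordering against page table updates that a
real implementation needs (a walker could enter right after a modifier
has observed zero), so treat it as an illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-address-space state (not the kernel's mm_struct). */
struct model_mm {
	atomic_int gup_walkers;	/* number of fast gup walkers currently active */
};

static void model_mm_init(struct model_mm *mm)
{
	atomic_init(&mm->gup_walkers, 0);
}

/* Walker side: announce ourselves before touching the page tables. */
static void walker_enter(struct model_mm *mm)
{
	atomic_fetch_add(&mm->gup_walkers, 1);
}

/* Walker side: drop out once the walk is finished. */
static void walker_exit(struct model_mm *mm)
{
	int prev = atomic_fetch_sub(&mm->gup_walkers, 1);

	assert(prev > 0);	/* an exit without a matching enter is a bug */
}

/* Modifier side: true when no walker is active on this address space. */
static bool no_walkers_active(struct model_mm *mm)
{
	return atomic_load(&mm->gup_walkers) == 0;
}

/* Modifier side: spin while the counter is positive. */
static void wait_for_walkers(struct model_mm *mm)
{
	while (!no_walkers_active(mm))
		;	/* busy-wait; kernel code would call cpu_relax() here */
}
```

In this model, code that splits a THP or frees a page table page would
call wait_for_walkers() before proceeding, while each fast gup walk is
bracketed by walker_enter() and walker_exit().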
This series also addresses an assumption made in kernel/futex.c that
THP page splitting can be blocked by disabling the IRQs on a processor
by introducing arch_block_thp_split and arch_unblock_thp_split.
As well as fixing a problem where futexes on THP tails cause hangs on
ARM, I expect this series to also be beneficial for direct-IO, and for
KVM (the hva_to_pfn fast path uses __get_user_pages_fast).
Any comments would be greatly appreciated.
Steve Capper (2):
  thp: Introduce arch_(un)block_thp_split
  arm: mm: implement get_user_pages_fast

 arch/arm/include/asm/mmu.h            |   1 +
 arch/arm/include/asm/pgalloc.h        |   9 ++
 arch/arm/include/asm/pgtable-2level.h |   1 +
 arch/arm/include/asm/pgtable-3level.h |  21 +++
 arch/arm/include/asm/pgtable.h        |  18 +++
 arch/arm/include/asm/tlb.h            |   8 ++
 arch/arm/mm/Makefile                  |   2 +-
 arch/arm/mm/gup.c                     | 234 ++++++++++++++++++++++++++++++++++
 include/linux/huge_mm.h               |  16 +++
 kernel/futex.c                        |   6 +-
 10 files changed, 312 insertions(+), 4 deletions(-)
 create mode 100644 arch/arm/mm/gup.c
--
1.8.1.4
Hi Rafael,
I have pushed the ARM cpufreq patches for v3.13; you can pull them from
my repo. Sorry if I am late.
The following changes since commit 959f58544b7f20c92d5eb43d1232c96c15c01bfb:
Linux 3.12-rc7 (2013-10-27 16:12:03 -0700)
are available in the git repository at:
git://git.linaro.org/people/vireshk/linux.git cpufreq-next-for-3.13
for you to fetch changes up to 2332c7a7a8c1979de68429dcdcb125037ef35f4b:
cpufreq: arm_big_little: reconfigure switcher behavior at run time
(2013-10-29 03:18:41 +0530)
----------------------------------------------------------------
Nicolas Pitre (1):
  cpufreq: arm_big_little: reconfigure switcher behavior at run time

Sudeep KarkadaNagesha (5):
  cpufreq: arm-big-little: use clk_get instead of clk_get_sys
  ARM: vexpress/TC2: add support for CPU DVFS
  ARM: vexpress/TC2: add cpu clock support
  cpufreq: arm_big_little: add vexpress SPC interface driver
  ARM: vexpress/TC2: register vexpress-spc cpufreq device

Viresh Kumar (1):
  cpufreq: arm_big_little: add in-kernel switching (IKS) support

 arch/arm/mach-vexpress/Kconfig         |  12 +
 arch/arm/mach-vexpress/Makefile        |   3 +-
 arch/arm/mach-vexpress/spc.c           | 366 +++++++++++++++++++++++++++-
 arch/arm/mach-vexpress/spc.h           |   2 +-
 arch/arm/mach-vexpress/tc2_pm.c        |   7 +-
 drivers/cpufreq/Kconfig.arm            |   8 +
 drivers/cpufreq/Makefile               |   1 +
 drivers/cpufreq/arm_big_little.c       | 420 ++++++++++++++++++++++++++++++---
 drivers/cpufreq/arm_big_little.h       |   5 -
 drivers/cpufreq/vexpress-spc-cpufreq.c |  69 ++++++
 10 files changed, 853 insertions(+), 40 deletions(-)
 create mode 100644 drivers/cpufreq/vexpress-spc-cpufreq.c
Hi All,
a) As per the GIC-400, all physical interrupts trap into the hypervisor.
b) The hypervisor does the ACK, programs the virtual GIC list registers
(with PhysIRQ:VIRQ) and does a world switch.
c) The GIC CPU interface interrupts the guest with the VIRQ.
d) The guest does an ACK and an EOI to the GIC CPU interface.
e) The hypervisor gets a maintenance interrupt when the guest does the EOI.
f) The hypervisor then clears the physical interrupt.
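To put a number on the sequence above, here is a small C sketch that
tabulates steps a) to f) and counts the transitions between guest and
hypervisor. It only encodes the flow as I have described it, not
verified GIC behaviour; the type and function names are made up for
illustration, and the assignment of each step to a world is my own
reading of the list.

```c
#include <assert.h>
#include <stddef.h>

/* Which side is executing during a given step (illustrative only). */
enum world { WORLD_HYP, WORLD_GUEST };

struct step {
	const char *what;
	enum world runs_in;
};

/* Steps a) to f) from the list above, in order. */
static const struct step irq_flow[] = {
	{ "a) physical IRQ traps into the hypervisor",      WORLD_HYP   },
	{ "b) hypervisor ACKs and programs list registers", WORLD_HYP   },
	{ "c) guest is interrupted with the VIRQ",          WORLD_GUEST },
	{ "d) guest does ACK and EOI",                      WORLD_GUEST },
	{ "e) hypervisor gets the maintenance interrupt",   WORLD_HYP   },
	{ "f) hypervisor clears the physical interrupt",    WORLD_HYP   },
};

#define IRQ_FLOW_STEPS (sizeof(irq_flow) / sizeof(irq_flow[0]))

/*
 * Count guest<->hypervisor transitions for one interrupt, assuming
 * execution starts in 'start' and returns there once the flow is done.
 */
static unsigned int count_world_switches(const struct step *flow, size_t n,
					 enum world start)
{
	unsigned int switches = 0;
	enum world cur = start;

	for (size_t i = 0; i < n; i++) {
		if (flow[i].runs_in != cur) {
			switches++;
			cur = flow[i].runs_in;
		}
	}
	if (cur != start)
		switches++;	/* final return to the interrupted world */
	return switches;
}
```

Under these assumptions, an interrupt that arrives while the guest is
running is counted by calling
count_world_switches(irq_flow, IRQ_FLOW_STEPS, WORLD_GUEST).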
So a single interrupt involves this many context switches? Is the
sequence right? If I am missing anything, please let me know.
Also, if a device is private to a guest, this many context switches
would reduce performance if the device interrupts a lot.
My questions are:
a) Is the above flow correct?
b) Is this the only flow, or are there some optimisations?
Thanks and regards