Part of the energy aware scheduler work is concerned by test and measurement tools such as idlestat[1]. This relies on the ability to trace wake-up events, mainly IRQs and IPIs.
While IRQs are already well instrumented with tracepoints, IPIs are rather lacking on that front, and completely invisible on ARM.
This series defines generic IPI tracepoints and adds them to ARM and ARM64. An attempt at adding them to X86 is also included for comments.
[1] https://wiki.linaro.org/WorkingGroups/PowerManagement/Resources/Tools/Idlest...
diffstat:
arch/arm/kernel/smp.c | 72 ++++++++++++++++++++------------ arch/arm64/kernel/smp.c | 67 +++++++++++++++++++----------- arch/x86/kernel/smp.c | 16 ++++++++ include/trace/events/ipi.h | 89 ++++++++++++++++++++++++++++++++++++++++ 4 files changed, 192 insertions(+), 52 deletions(-)
The Inter Processor Interrupt is used to make another processor do a specific action such as rescheduling tasks, signal a timer event or execute something in another CPU's context. IRQs are already traceable but IPIs were not. Tracing them is useful for monitoring IPI latency, or to verify when they are the source of CPU wake-ups with power management implications.
Three trace hooks are defined: ipi_raise, ipi_entry and ipi_exit. To make them portable, a string is used to identify them and correlate related events. Additionally, ipi_raise records a bitmask representing targeted CPUs.
Signed-off-by: Nicolas Pitre nico@linaro.org --- include/trace/events/ipi.h | 89 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 89 insertions(+) create mode 100644 include/trace/events/ipi.h
diff --git a/include/trace/events/ipi.h b/include/trace/events/ipi.h new file mode 100644 index 0000000000..834a7362a6 --- /dev/null +++ b/include/trace/events/ipi.h @@ -0,0 +1,89 @@ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM ipi + +#if !defined(_TRACE_IPI_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_IPI_H + +#include <linux/tracepoint.h> + +/** + * ipi_raise - called when a smp cross call is made + * + * @mask: mask of recipient CPUs for the IPI + * @reason: string identifying the IPI purpose + * + * It is necessary for @reason to be a static string declared with + * __tracepoint_string. + */ +TRACE_EVENT(ipi_raise, + + TP_PROTO(const struct cpumask *mask, const char *reason), + + TP_ARGS(mask, reason), + + TP_STRUCT__entry( + __bitmask(target_cpus, nr_cpumask_bits) + __field(const char *, reason) + ), + + TP_fast_assign( + __assign_bitmask(target_cpus, cpumask_bits(mask), nr_cpumask_bits); + __entry->reason = reason; + ), + + TP_printk("target_mask=%s (%s)", __get_bitmask(target_cpus), __entry->reason) +); + +DECLARE_EVENT_CLASS(ipi_handler, + + TP_PROTO(const char *reason), + + TP_ARGS(reason), + + TP_STRUCT__entry( + __field(const char *, reason) + ), + + TP_fast_assign( + __entry->reason = reason; + ), + + TP_printk("(%s)", __entry->reason) +); + +/** + * ipi_entry - called immediately before the IPI handler + * + * @reason: string identifying the IPI purpose + * + * It is necessary for @reason to be a static string declared with + * __tracepoint_string, ideally the same as used with trace_ipi_raise + * for that IPI. + */ +DEFINE_EVENT(ipi_handler, ipi_entry, + + TP_PROTO(const char *reason), + + TP_ARGS(reason) +); + +/** + * ipi_exit - called immediately after the IPI handler returns + * + * @reason: string identifying the IPI purpose + * + * It is necessary for @reason to be a static string declared with + * __tracepoint_string, ideally the same as used with trace_ipi_raise for + * that IPI. + */ +DEFINE_EVENT(ipi_handler, ipi_exit, + + TP_PROTO(const char *reason), + + TP_ARGS(reason) +); + +#endif /* _TRACE_IPI_H */ + +/* This part must be outside protection */ +#include <trace/define_trace.h>
On 07/18/2014 07:18 AM, Nicolas Pitre wrote:
The Inter Processor Interrupt is used to make another processor do a specific action such as rescheduling tasks, signal a timer event or execute something in another CPU's context. IRQs are already traceable but IPIs were not. Tracing them is useful for monitoring IPI latency, or to verify when they are the source of CPU wake-ups with power management implications.
Three trace hooks are defined: ipi_raise, ipi_entry and ipi_exit. To make them portable, a string is used to identify them and correlate related events. Additionally, ipi_raise records a bitmask representing targeted CPUs.
Signed-off-by: Nicolas Pitre nico@linaro.org
Acked-by: Daniel Lezcano daniel.lezcano@linaro.org
include/trace/events/ipi.h | 89 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 89 insertions(+) create mode 100644 include/trace/events/ipi.h
diff --git a/include/trace/events/ipi.h b/include/trace/events/ipi.h new file mode 100644 index 0000000000..834a7362a6 --- /dev/null +++ b/include/trace/events/ipi.h @@ -0,0 +1,89 @@ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM ipi
+#if !defined(_TRACE_IPI_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_IPI_H
+#include <linux/tracepoint.h>
+/**
- ipi_raise - called when a smp cross call is made
- @mask: mask of recipient CPUs for the IPI
- @reason: string identifying the IPI purpose
- It is necessary for @reason to be a static string declared with
- __tracepoint_string.
- */
+TRACE_EVENT(ipi_raise,
- TP_PROTO(const struct cpumask *mask, const char *reason),
- TP_ARGS(mask, reason),
- TP_STRUCT__entry(
__bitmask(target_cpus, nr_cpumask_bits)
__field(const char *, reason)
- ),
- TP_fast_assign(
__assign_bitmask(target_cpus, cpumask_bits(mask), nr_cpumask_bits);
__entry->reason = reason;
- ),
- TP_printk("target_mask=%s (%s)", __get_bitmask(target_cpus), __entry->reason)
+);
+DECLARE_EVENT_CLASS(ipi_handler,
- TP_PROTO(const char *reason),
- TP_ARGS(reason),
- TP_STRUCT__entry(
__field(const char *, reason)
- ),
- TP_fast_assign(
__entry->reason = reason;
- ),
- TP_printk("(%s)", __entry->reason)
+);
+/**
- ipi_entry - called immediately before the IPI handler
- @reason: string identifying the IPI purpose
- It is necessary for @reason to be a static string declared with
- __tracepoint_string, ideally the same as used with trace_ipi_raise
- for that IPI.
- */
+DEFINE_EVENT(ipi_handler, ipi_entry,
- TP_PROTO(const char *reason),
- TP_ARGS(reason)
+);
+/**
- ipi_exit - called immediately after the IPI handler returns
- @reason: string identifying the IPI purpose
- It is necessary for @reason to be a static string declared with
- __tracepoint_string, ideally the same as used with trace_ipi_raise for
- that IPI.
- */
+DEFINE_EVENT(ipi_handler, ipi_exit,
- TP_PROTO(const char *reason),
- TP_ARGS(reason)
+);
+#endif /* _TRACE_IPI_H */
+/* This part must be outside protection */ +#include <trace/define_trace.h>
The strings used to list IPIs in /proc/interrupts are reused for tracing purposes.
Signed-off-by: Nicolas Pitre nico@linaro.org --- arch/arm/kernel/smp.c | 72 ++++++++++++++++++++++++++++++++------------------- 1 file changed, 45 insertions(+), 27 deletions(-)
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 7c4fada440..daedff319b 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -47,6 +47,9 @@ #include <asm/mach/arch.h> #include <asm/mpu.h>
+#define CREATE_TRACE_POINTS +#include <trace/events/ipi.h> + /* * as from 2.5, kernels no longer have an init_tasks structure * so we need some other way of telling a new secondary core @@ -430,38 +433,19 @@ void __init smp_prepare_cpus(unsigned int max_cpus) } }
-static void (*smp_cross_call)(const struct cpumask *, unsigned int); +static void (*__smp_cross_call)(const struct cpumask *, unsigned int);
void __init set_smp_cross_call(void (*fn)(const struct cpumask *, unsigned int)) { - if (!smp_cross_call) - smp_cross_call = fn; -} - -void arch_send_call_function_ipi_mask(const struct cpumask *mask) -{ - smp_cross_call(mask, IPI_CALL_FUNC); -} - -void arch_send_wakeup_ipi_mask(const struct cpumask *mask) -{ - smp_cross_call(mask, IPI_WAKEUP); -} - -void arch_send_call_function_single_ipi(int cpu) -{ - smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE); + if (!__smp_cross_call) + __smp_cross_call = fn; }
-#ifdef CONFIG_IRQ_WORK -void arch_irq_work_raise(void) -{ - if (is_smp()) - smp_cross_call(cpumask_of(smp_processor_id()), IPI_IRQ_WORK); -} +static const char *ipi_types[NR_IPI] +#ifdef CONFIG_TRACING +__tracepoint_string #endif - -static const char *ipi_types[NR_IPI] = { += { #define S(x,s) [x] = s S(IPI_WAKEUP, "CPU wakeup interrupts"), S(IPI_TIMER, "Timer broadcast interrupts"), @@ -473,6 +457,12 @@ static const char *ipi_types[NR_IPI] = { S(IPI_COMPLETION, "completion interrupts"), };
+static void smp_cross_call(const struct cpumask *target, unsigned int ipinr) +{ + trace_ipi_raise(target, ipi_types[ipinr]); + __smp_cross_call(target, ipinr); +} + void show_ipi_list(struct seq_file *p, int prec) { unsigned int cpu, i; @@ -499,6 +489,29 @@ u64 smp_irq_stat_cpu(unsigned int cpu) return sum; }
+void arch_send_call_function_ipi_mask(const struct cpumask *mask) +{ + smp_cross_call(mask, IPI_CALL_FUNC); +} + +void arch_send_wakeup_ipi_mask(const struct cpumask *mask) +{ + smp_cross_call(mask, IPI_WAKEUP); +} + +void arch_send_call_function_single_ipi(int cpu) +{ + smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE); +} + +#ifdef CONFIG_IRQ_WORK +void arch_irq_work_raise(void) +{ + if (is_smp()) + smp_cross_call(cpumask_of(smp_processor_id()), IPI_IRQ_WORK); +} +#endif + #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST void tick_broadcast(const struct cpumask *mask) { @@ -556,8 +569,10 @@ void handle_IPI(int ipinr, struct pt_regs *regs) unsigned int cpu = smp_processor_id(); struct pt_regs *old_regs = set_irq_regs(regs);
- if (ipinr < NR_IPI) + if ((unsigned)ipinr < NR_IPI) { + trace_ipi_entry(ipi_types[ipinr]); __inc_irq_stat(cpu, ipi_irqs[ipinr]); + }
switch (ipinr) { case IPI_WAKEUP: @@ -612,6 +627,9 @@ void handle_IPI(int ipinr, struct pt_regs *regs) cpu, ipinr); break; } + + if ((unsigned)ipinr < NR_IPI) + trace_ipi_exit(ipi_types[ipinr]); set_irq_regs(old_regs); }
On Fri, 18 Jul 2014 01:18:53 -0400 Nicolas Pitre nicolas.pitre@linaro.org wrote:
-#ifdef CONFIG_IRQ_WORK -void arch_irq_work_raise(void) -{
- if (is_smp())
smp_cross_call(cpumask_of(smp_processor_id()), IPI_IRQ_WORK);
-} +static const char *ipi_types[NR_IPI] +#ifdef CONFIG_TRACING +__tracepoint_string #endif
Oh, this is ugly. I should probably add a define when !CONFIG_TRACING is set. Something like:
#ifdef CONFIG_TRACING ... #else # define __tracepoint_string #endif
Such that users of __tracepoint_string don't need to add ugly ifdefs in the code.
If you want to add that to ftrace_event.h to this series, I'll ack it.
-- Steve
-static const char *ipi_types[NR_IPI] = { += { #define S(x,s) [x] = s S(IPI_WAKEUP, "CPU wakeup interrupts"), S(IPI_TIMER, "Timer broadcast interrupts"), @@ -473,6 +457,12 @@ static const char *ipi_types[NR_IPI] = { S(IPI_COMPLETION, "completion interrupts"), };
On Fri, 18 Jul 2014, Steven Rostedt wrote:
On Fri, 18 Jul 2014 01:18:53 -0400 Nicolas Pitre nicolas.pitre@linaro.org wrote:
-#ifdef CONFIG_IRQ_WORK -void arch_irq_work_raise(void) -{
- if (is_smp())
smp_cross_call(cpumask_of(smp_processor_id()), IPI_IRQ_WORK);
-} +static const char *ipi_types[NR_IPI] +#ifdef CONFIG_TRACING +__tracepoint_string #endif
Oh, this is ugly. I should probably add a define when !CONFIG_TRACING is set. Something like:
#ifdef CONFIG_TRACING ... #else # define __tracepoint_string #endif
Such that users of __tracepoint_string don't need to add ugly ifdefs in the code.
If you want to add that to ftrace_event.h to this series, I'll ack it.
Here's the patch I have at the head of the series now, with the above ugliness changed to an unconditional __tracepoint_string attribute.
From: Nicolas Pitre nicolas.pitre@linaro.org Date: Fri, 18 Jul 2014 16:34:39 -0400 Subject: [PATCH] trace: don't refer __tracepoint_string to an undefined linker section
When CONFIG_TRACING is not set, the linker script doesn't specify any __tracepoint_str section. Let those __tracepoint_string marked strings live in the default rodata section in that case.
Signed-off-by: Nicolas Pitre nico@linaro.org
diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h index cff3106ffe..d6346607e4 100644 --- a/include/linux/ftrace_event.h +++ b/include/linux/ftrace_event.h @@ -606,7 +606,11 @@ do { \ static const char *___tp_str __tracepoint_string = str; \ ___tp_str; \ }) +#ifdef CONFIG_TRACING #define __tracepoint_string __attribute__((section("__tracepoint_str"))) +#else +#define __tracepoint_string +#endif
#ifdef CONFIG_PERF_EVENTS struct perf_event;
On Fri, 18 Jul 2014 16:55:42 -0400 (EDT) Nicolas Pitre nicolas.pitre@linaro.org wrote:
Here's the patch I have at the head of the series now, with the above ugliness changed to an unconditional __tracepoint_string attribute.
I was thinking of something like this. Feel free to add this to your series.
-- Steve
From: Steven Rostedt rostedt@goodmis.org Subject: [PATCH] tracing: Do not do anything special with tracepoint_string when tracing is disabled
When CONFIG_TRACING is not enabled, there's no reason to save the trace strings either by the linker or as a static variable that can be referenced later. Simply pass back the string that is given to tracepoint_string().
Signed-off-by: Steven Rostedt rostedt@goodmis.org --- diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h index cff3106..b296363 100644 --- a/include/linux/ftrace_event.h +++ b/include/linux/ftrace_event.h @@ -574,6 +574,7 @@ do { \ __trace_printk(ip, fmt, ##args); \ } while (0)
+#ifdef CONFIG_TRACING /** * tracepoint_string - register constant persistent string to trace system * @str - a constant persistent string that will be referenced in tracepoints @@ -607,6 +608,15 @@ do { \ ___tp_str; \ }) #define __tracepoint_string __attribute__((section("__tracepoint_str"))) +#else +/* + * tracepoint_string() is used to save the string address for userspace + * tracing tools. When tracing isn't configured, there's no need to save + * anything. + */ +# define tracepoint_string(str) str +# define __tracepoint_string +#endif
#ifdef CONFIG_PERF_EVENTS struct perf_event;
On Fri, 18 Jul 2014, Steven Rostedt wrote:
On Fri, 18 Jul 2014 16:55:42 -0400 (EDT) Nicolas Pitre nicolas.pitre@linaro.org wrote:
Here's the patch I have at the head of the series now, with the above ugliness changed to an unconditional __tracepoint_string attribute.
I was thinking of something like this. Feel free to add this to your series.
OK. Same end result, but much clearer. Thanks.
Any comments / ACKs on the other patches? I'd like to see 1/4 to 3/4 (and your patch) merged upstream during the next window. 4/4 is up for debate.
-- Steve
From: Steven Rostedt rostedt@goodmis.org Subject: [PATCH] tracing: Do not do anything special with tracepoint_string when tracing is disabled
When CONFIG_TRACING is not enabled, there's no reason to save the trace strings either by the linker or as a static variable that can be referenced later. Simply pass back the string that is given to tracepoint_string().
Signed-off-by: Steven Rostedt rostedt@goodmis.org
diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h index cff3106..b296363 100644 --- a/include/linux/ftrace_event.h +++ b/include/linux/ftrace_event.h @@ -574,6 +574,7 @@ do { \ __trace_printk(ip, fmt, ##args); \ } while (0) +#ifdef CONFIG_TRACING /**
- tracepoint_string - register constant persistent string to trace system
- @str - a constant persistent string that will be referenced in tracepoints
@@ -607,6 +608,15 @@ do { \ ___tp_str; \ }) #define __tracepoint_string __attribute__((section("__tracepoint_str"))) +#else +/*
- tracepoint_string() is used to save the string address for userspace
- tracing tools. When tracing isn't configured, there's no need to save
- anything.
- */
+# define tracepoint_string(str) str +# define __tracepoint_string +#endif #ifdef CONFIG_PERF_EVENTS struct perf_event;
On Fri, 18 Jul 2014 22:55:12 -0400 (EDT) Nicolas Pitre nicolas.pitre@linaro.org wrote:
Any comments / ACKs on the other patches? I'd like to see 1/4 to 3/4 (and your patch) merged upstream during the next window. 4/4 is up for debate.
You can add my Acked-by for patches 1,2 and 3, after the clean up of the #ifdefs there.
I still don't like the fact that you need to add the #undef in patch 4. I'm looking at other ways to fix that. I tried to do a few different clean ups in the tracing infrastructure, but it seems that it may be required. The other method I may try is to move the #undefs into the ipi.h header itself.
I'll play more with this on Monday.
-- Steve
On 18 July 2014 23:22, Steven Rostedt rostedt@goodmis.org wrote:
On Fri, 18 Jul 2014 16:55:42 -0400 (EDT) Nicolas Pitre nicolas.pitre@linaro.org wrote:
Here's the patch I have at the head of the series now, with the above ugliness changed to an unconditional __tracepoint_string attribute.
I was thinking of something like this. Feel free to add this to your series.
-- Steve
Nico,
If this patch addresses the issue where 3 RCU related tracepoint strings turn up /after/ _edata on !CONFIG_TRACING, there is already a patch queued up here
http://marc.info/?l=linux-kernel&m=140518452623148&w=2
As far as In know, these were the only occurrences using a __used modifier, which is why they weren't dropped by the compiler in the !CONFIG_TRACING case.
Regards, Ard.
From: Steven Rostedt rostedt@goodmis.org Subject: [PATCH] tracing: Do not do anything special with tracepoint_string when tracing is disabled
When CONFIG_TRACING is not enabled, there's no reason to save the trace strings either by the linker or as a static variable that can be referenced later. Simply pass back the string that is given to tracepoint_string().
Signed-off-by: Steven Rostedt rostedt@goodmis.org
diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h index cff3106..b296363 100644 --- a/include/linux/ftrace_event.h +++ b/include/linux/ftrace_event.h @@ -574,6 +574,7 @@ do { \ __trace_printk(ip, fmt, ##args); \ } while (0)
+#ifdef CONFIG_TRACING /**
- tracepoint_string - register constant persistent string to trace system
- @str - a constant persistent string that will be referenced in tracepoints
@@ -607,6 +608,15 @@ do { \ ___tp_str; \ }) #define __tracepoint_string __attribute__((section("__tracepoint_str"))) +#else +/*
- tracepoint_string() is used to save the string address for userspace
- tracing tools. When tracing isn't configured, there's no need to save
- anything.
- */
+# define tracepoint_string(str) str +# define __tracepoint_string +#endif
#ifdef CONFIG_PERF_EVENTS struct perf_event;
linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Sat, 19 Jul 2014 21:10:37 +0200 Ard Biesheuvel ard.biesheuvel@linaro.org wrote:
On 18 July 2014 23:22, Steven Rostedt rostedt@goodmis.org wrote:
On Fri, 18 Jul 2014 16:55:42 -0400 (EDT) Nicolas Pitre nicolas.pitre@linaro.org wrote:
Here's the patch I have at the head of the series now, with the above ugliness changed to an unconditional __tracepoint_string attribute.
I was thinking of something like this. Feel free to add this to your series.
-- Steve
Nico,
If this patch addresses the issue where 3 RCU related tracepoint strings turn up /after/ _edata on !CONFIG_TRACING, there is already a patch queued up here
http://marc.info/?l=linux-kernel&m=140518452623148&w=2
As far as In know, these were the only occurrences using a __used modifier, which is why they weren't dropped by the compiler in the !CONFIG_TRACING case.
Ard,
Similar but different problem. Nicolas's problem was with new use cases for tracepoint_string. My patch fixes the issue for the general case.
-- Steve
On 19 July 2014 22:28, Steven Rostedt rostedt@goodmis.org wrote:
On Sat, 19 Jul 2014 21:10:37 +0200 Ard Biesheuvel ard.biesheuvel@linaro.org wrote:
On 18 July 2014 23:22, Steven Rostedt rostedt@goodmis.org wrote:
On Fri, 18 Jul 2014 16:55:42 -0400 (EDT) Nicolas Pitre nicolas.pitre@linaro.org wrote:
Here's the patch I have at the head of the series now, with the above ugliness changed to an unconditional __tracepoint_string attribute.
I was thinking of something like this. Feel free to add this to your series.
-- Steve
Nico,
If this patch addresses the issue where 3 RCU related tracepoint strings turn up /after/ _edata on !CONFIG_TRACING, there is already a patch queued up here
http://marc.info/?l=linux-kernel&m=140518452623148&w=2
As far as In know, these were the only occurrences using a __used modifier, which is why they weren't dropped by the compiler in the !CONFIG_TRACING case.
Ard,
Similar but different problem. Nicolas's problem was with new use cases for tracepoint_string. My patch fixes the issue for the general case.
OK, so if the general case has been fixed, perhaps we should ask Paul to drop my patch?
On Sat, 19 Jul 2014 22:50:16 +0200 Ard Biesheuvel ard.biesheuvel@linaro.org wrote:
OK, so if the general case has been fixed, perhaps we should ask Paul to drop my patch?
No, for a few reasons. One, this patch still needs to get in to fix the problem for RCU. Two, RCU basically open codes the creation of the string, thus this wont solve it for RCU.
-- Steve
On Sat, 19 Jul 2014, Ard Biesheuvel wrote:
On 18 July 2014 23:22, Steven Rostedt rostedt@goodmis.org wrote:
On Fri, 18 Jul 2014 16:55:42 -0400 (EDT) Nicolas Pitre nicolas.pitre@linaro.org wrote:
Here's the patch I have at the head of the series now, with the above ugliness changed to an unconditional __tracepoint_string attribute.
I was thinking of something like this. Feel free to add this to your series.
-- Steve
Nico,
If this patch addresses the issue where 3 RCU related tracepoint strings turn up /after/ _edata on !CONFIG_TRACING, there is already a patch queued up here
No, that doesn't help my case. Please see the initial comment from Steven in this thread and you'll understand.
Nicolas
@Russell: Can I have your ACK on this patch before I send the series upstream? Here's the revised version accounting for the changes the discussion in this thread brought about
Author: Nicolas Pitre nicolas.pitre@linaro.org Date: Fri Jul 18 00:07:05 2014 -0400
ARM: add IPI tracepoints
The strings used to list IPIs in /proc/interrupts are reused for tracing purposes.
While at it, prevent a negative ipinr from escaping the range check in handle_IPI().
Signed-off-by: Nicolas Pitre nico@linaro.org Acked-by: Steven Rostedt rostedt@goodmis.org
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 7c4fada440..9388a3d479 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -47,6 +47,9 @@ #include <asm/mach/arch.h> #include <asm/mpu.h>
+#define CREATE_TRACE_POINTS +#include <trace/events/ipi.h> + /* * as from 2.5, kernels no longer have an init_tasks structure * so we need some other way of telling a new secondary core @@ -430,38 +433,15 @@ void __init smp_prepare_cpus(unsigned int max_cpus) } }
-static void (*smp_cross_call)(const struct cpumask *, unsigned int); +static void (*__smp_cross_call)(const struct cpumask *, unsigned int);
void __init set_smp_cross_call(void (*fn)(const struct cpumask *, unsigned int)) { - if (!smp_cross_call) - smp_cross_call = fn; -} - -void arch_send_call_function_ipi_mask(const struct cpumask *mask) -{ - smp_cross_call(mask, IPI_CALL_FUNC); -} - -void arch_send_wakeup_ipi_mask(const struct cpumask *mask) -{ - smp_cross_call(mask, IPI_WAKEUP); -} - -void arch_send_call_function_single_ipi(int cpu) -{ - smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE); + if (!__smp_cross_call) + __smp_cross_call = fn; }
-#ifdef CONFIG_IRQ_WORK -void arch_irq_work_raise(void) -{ - if (is_smp()) - smp_cross_call(cpumask_of(smp_processor_id()), IPI_IRQ_WORK); -} -#endif - -static const char *ipi_types[NR_IPI] = { +static const char *ipi_types[NR_IPI] __tracepoint_string = { #define S(x,s) [x] = s S(IPI_WAKEUP, "CPU wakeup interrupts"), S(IPI_TIMER, "Timer broadcast interrupts"), @@ -473,6 +453,12 @@ static const char *ipi_types[NR_IPI] = { S(IPI_COMPLETION, "completion interrupts"), };
+static void smp_cross_call(const struct cpumask *target, unsigned int ipinr) +{ + trace_ipi_raise(target, ipi_types[ipinr]); + __smp_cross_call(target, ipinr); +} + void show_ipi_list(struct seq_file *p, int prec) { unsigned int cpu, i; @@ -499,6 +485,29 @@ u64 smp_irq_stat_cpu(unsigned int cpu) return sum; }
+void arch_send_call_function_ipi_mask(const struct cpumask *mask) +{ + smp_cross_call(mask, IPI_CALL_FUNC); +} + +void arch_send_wakeup_ipi_mask(const struct cpumask *mask) +{ + smp_cross_call(mask, IPI_WAKEUP); +} + +void arch_send_call_function_single_ipi(int cpu) +{ + smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE); +} + +#ifdef CONFIG_IRQ_WORK +void arch_irq_work_raise(void) +{ + if (is_smp()) + smp_cross_call(cpumask_of(smp_processor_id()), IPI_IRQ_WORK); +} +#endif + #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST void tick_broadcast(const struct cpumask *mask) { @@ -556,8 +565,10 @@ void handle_IPI(int ipinr, struct pt_regs *regs) unsigned int cpu = smp_processor_id(); struct pt_regs *old_regs = set_irq_regs(regs);
- if (ipinr < NR_IPI) + if ((unsigned)ipinr < NR_IPI) { + trace_ipi_entry(ipi_types[ipinr]); __inc_irq_stat(cpu, ipi_irqs[ipinr]); + }
switch (ipinr) { case IPI_WAKEUP: @@ -612,6 +623,9 @@ void handle_IPI(int ipinr, struct pt_regs *regs) cpu, ipinr); break; } + + if ((unsigned)ipinr < NR_IPI) + trace_ipi_exit(ipi_types[ipinr]); set_irq_regs(old_regs); }
The strings used to list IPIs in /proc/interrupts are reused for tracing purposes.
While at it, the code is slightly cleaned up so the ipi_types array indices are no longer offset by IPI_RESCHEDULE whose value is 0 anyway.
Signed-off-by: Nicolas Pitre nico@linaro.org --- arch/arm64/kernel/smp.c | 67 +++++++++++++++++++++++++++++++------------------ 1 file changed, 42 insertions(+), 25 deletions(-)
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c index 40f38f46c8..78a5904994 100644 --- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -50,6 +50,9 @@ #include <asm/tlbflush.h> #include <asm/ptrace.h>
+#define CREATE_TRACE_POINTS +#include <trace/events/ipi.h> + /* * as from 2.5, kernels no longer have an init_tasks structure * so we need some other way of telling a new secondary core @@ -307,8 +310,6 @@ void __init smp_prepare_boot_cpu(void) set_my_cpu_offset(per_cpu_offset(smp_processor_id())); }
-static void (*smp_cross_call)(const struct cpumask *, unsigned int); - /* * Enumerate the possible CPU set from the device tree and build the * cpu logical map array containing MPIDR values related to logical @@ -463,32 +464,19 @@ void __init smp_prepare_cpus(unsigned int max_cpus) } }
+static void (*__smp_cross_call)(const struct cpumask *, unsigned int);
void __init set_smp_cross_call(void (*fn)(const struct cpumask *, unsigned int)) { - smp_cross_call = fn; -} - -void arch_send_call_function_ipi_mask(const struct cpumask *mask) -{ - smp_cross_call(mask, IPI_CALL_FUNC); -} - -void arch_send_call_function_single_ipi(int cpu) -{ - smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE); + __smp_cross_call = fn; }
-#ifdef CONFIG_IRQ_WORK -void arch_irq_work_raise(void) -{ - if (smp_cross_call) - smp_cross_call(cpumask_of(smp_processor_id()), IPI_IRQ_WORK); -} +static const char *ipi_types[NR_IPI] +#ifdef CONFIG_TRACING +__tracepoint_string #endif - -static const char *ipi_types[NR_IPI] = { -#define S(x,s) [x - IPI_RESCHEDULE] = s += { +#define S(x,s) [x] = s S(IPI_RESCHEDULE, "Rescheduling interrupts"), S(IPI_CALL_FUNC, "Function call interrupts"), S(IPI_CALL_FUNC_SINGLE, "Single function call interrupts"), @@ -497,12 +485,18 @@ static const char *ipi_types[NR_IPI] = { S(IPI_IRQ_WORK, "IRQ work interrupts"), };
+static void smp_cross_call(const struct cpumask *target, unsigned int ipinr) +{ + trace_ipi_raise(target, ipi_types[ipinr]); + __smp_cross_call(target, ipinr); +} + void show_ipi_list(struct seq_file *p, int prec) { unsigned int cpu, i;
for (i = 0; i < NR_IPI; i++) { - seq_printf(p, "%*s%u:%s", prec - 1, "IPI", i + IPI_RESCHEDULE, + seq_printf(p, "%*s%u:%s", prec - 1, "IPI", i, prec >= 4 ? " " : ""); for_each_online_cpu(cpu) seq_printf(p, "%10u ", @@ -522,6 +516,24 @@ u64 smp_irq_stat_cpu(unsigned int cpu) return sum; }
+void arch_send_call_function_ipi_mask(const struct cpumask *mask) +{ + smp_cross_call(mask, IPI_CALL_FUNC); +} + +void arch_send_call_function_single_ipi(int cpu) +{ + smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE); +} + +#ifdef CONFIG_IRQ_WORK +void arch_irq_work_raise(void) +{ + if (__smp_cross_call) + smp_cross_call(cpumask_of(smp_processor_id()), IPI_IRQ_WORK); +} +#endif + static DEFINE_RAW_SPINLOCK(stop_lock);
/* @@ -553,8 +565,10 @@ void handle_IPI(int ipinr, struct pt_regs *regs) unsigned int cpu = smp_processor_id(); struct pt_regs *old_regs = set_irq_regs(regs);
- if (ipinr >= IPI_RESCHEDULE && ipinr < IPI_RESCHEDULE + NR_IPI) - __inc_irq_stat(cpu, ipi_irqs[ipinr - IPI_RESCHEDULE]); + if ((unsigned)ipinr < NR_IPI) { + trace_ipi_entry(ipi_types[ipinr]); + __inc_irq_stat(cpu, ipi_irqs[ipinr]); + }
switch (ipinr) { case IPI_RESCHEDULE: @@ -599,6 +613,9 @@ void handle_IPI(int ipinr, struct pt_regs *regs) pr_crit("CPU%u: Unknown IPI message 0x%x\n", cpu, ipinr); break; } + + if ((unsigned)ipinr < NR_IPI) + trace_ipi_exit(ipi_types[ipinr]); set_irq_regs(old_regs); }
On Fri, Jul 18, 2014 at 06:18:54AM +0100, Nicolas Pitre wrote:
The strings used to list IPIs in /proc/interrupts are reused for tracing purposes.
While at it, the code is slightly cleaned up so the ipi_types array indices are no longer offset by IPI_RESCHEDULE whose value is 0 anyway.
Signed-off-by: Nicolas Pitre nico@linaro.org
Acked-by: Catalin Marinas catalin.marinas@arm.com
On X86 there are already tracepoints for IRQ vectors through which IPIs are handled. However this is highly X86 specific, and the IPI signaling is not currently traced.
This is an attempt at adding generic IPI tracepoints to X86.
Signed-off-by: Nicolas Pitre nico@linaro.org --- arch/x86/kernel/smp.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c index be8e1bde07..e154d176cf 100644 --- a/arch/x86/kernel/smp.c +++ b/arch/x86/kernel/smp.c @@ -31,6 +31,12 @@ #include <asm/apic.h> #include <asm/nmi.h> #include <asm/trace/irq_vectors.h> + +#define CREATE_TRACE_POINTS +#undef TRACE_INCLUDE_PATH +#undef TRACE_INCLUDE_FILE +#include <trace/events/ipi.h> + /* * Some notes on x86 processor bugs affecting SMP operation: * @@ -124,11 +130,13 @@ static void native_smp_send_reschedule(int cpu) WARN_ON(1); return; } + trace_ipi_raise(cpumask_of(cpu), tracepoint_string("RESCHEDULE")); apic->send_IPI_mask(cpumask_of(cpu), RESCHEDULE_VECTOR); }
void native_send_call_func_single_ipi(int cpu) { + trace_ipi_raise(cpumask_of(cpu), tracepoint_string("CALL_FUNCTION_SINGLE")); apic->send_IPI_mask(cpumask_of(cpu), CALL_FUNCTION_SINGLE_VECTOR); }
@@ -136,6 +144,8 @@ void native_send_call_func_ipi(const struct cpumask *mask) { cpumask_var_t allbutself;
+ trace_ipi_raise(mask, tracepoint_string("CALL_FUNCTION")); + if (!alloc_cpumask_var(&allbutself, GFP_ATOMIC)) { apic->send_IPI_mask(mask, CALL_FUNCTION_VECTOR); return; @@ -252,8 +262,10 @@ finish: */ static inline void __smp_reschedule_interrupt(void) { + trace_ipi_entry(tracepoint_string("RESCHEDULE")); inc_irq_stat(irq_resched_count); scheduler_ipi(); + trace_ipi_exit(tracepoint_string("RESCHEDULE")); }
__visible void smp_reschedule_interrupt(struct pt_regs *regs) @@ -291,8 +303,10 @@ __visible void smp_trace_reschedule_interrupt(struct pt_regs *regs)
static inline void __smp_call_function_interrupt(void) { + trace_ipi_entry(tracepoint_string("CALL_FUNCTION")); generic_smp_call_function_interrupt(); inc_irq_stat(irq_call_count); + trace_ipi_exit(tracepoint_string("CALL_FUNCTION")); }
__visible void smp_call_function_interrupt(struct pt_regs *regs) @@ -313,8 +327,10 @@ __visible void smp_trace_call_function_interrupt(struct pt_regs *regs)
static inline void __smp_call_function_single_interrupt(void) { + trace_ipi_entry(tracepoint_string("CALL_FUNCTION_SINGLE")); generic_smp_call_function_single_interrupt(); inc_irq_stat(irq_call_count); + trace_ipi_exit(tracepoint_string("CALL_FUNCTION_SINGLE")); }
__visible void smp_call_function_single_interrupt(struct pt_regs *regs)
On Fri, 18 Jul 2014 01:18:55 -0400 Nicolas Pitre nicolas.pitre@linaro.org wrote:
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c index be8e1bde07..e154d176cf 100644 --- a/arch/x86/kernel/smp.c +++ b/arch/x86/kernel/smp.c @@ -31,6 +31,12 @@ #include <asm/apic.h> #include <asm/nmi.h> #include <asm/trace/irq_vectors.h>
+#define CREATE_TRACE_POINTS +#undef TRACE_INCLUDE_PATH +#undef TRACE_INCLUDE_FILE
I'm curious to why you added the #undefs. I wouldn't think they were needed.
-- Steve
+#include <trace/events/ipi.h>
/*
- Some notes on x86 processor bugs affecting SMP operation:
On Fri, 18 Jul 2014, Steven Rostedt wrote:
On Fri, 18 Jul 2014 01:18:55 -0400 Nicolas Pitre nicolas.pitre@linaro.org wrote:
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c index be8e1bde07..e154d176cf 100644 --- a/arch/x86/kernel/smp.c +++ b/arch/x86/kernel/smp.c @@ -31,6 +31,12 @@ #include <asm/apic.h> #include <asm/nmi.h> #include <asm/trace/irq_vectors.h>
+#define CREATE_TRACE_POINTS +#undef TRACE_INCLUDE_PATH +#undef TRACE_INCLUDE_FILE
I'm curious to why you added the #undefs. I wouldn't think they were needed.
They are needed because asm/trace/irq_vectors.h included prior that point defines them for itself and that makes the inclusion of trace/events/ipi.h fail.
-- Steve
+#include <trace/events/ipi.h>
/*
- Some notes on x86 processor bugs affecting SMP operation:
On Fri, 18 Jul 2014 16:59:50 -0400 (EDT) Nicolas Pitre nicolas.pitre@linaro.org wrote:
On Fri, 18 Jul 2014, Steven Rostedt wrote:
On Fri, 18 Jul 2014 01:18:55 -0400 Nicolas Pitre nicolas.pitre@linaro.org wrote:
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c index be8e1bde07..e154d176cf 100644 --- a/arch/x86/kernel/smp.c +++ b/arch/x86/kernel/smp.c @@ -31,6 +31,12 @@ #include <asm/apic.h> #include <asm/nmi.h> #include <asm/trace/irq_vectors.h>
+#define CREATE_TRACE_POINTS +#undef TRACE_INCLUDE_PATH +#undef TRACE_INCLUDE_FILE
I'm curious to why you added the #undefs. I wouldn't think they were needed.
They are needed because asm/trace/irq_vectors.h included prior that point defines them for itself and that makes the inclusion of trace/events/ipi.h fail.
Bah, I tried to get rid of the need for those, but it seems that the macro magic is getting a bit out of hand. I need a new macro magic hat :-p
What you got here will have to do.
-- Steve
On Mon, 21 Jul 2014, Steven Rostedt wrote:
On Fri, 18 Jul 2014 16:59:50 -0400 (EDT) Nicolas Pitre nicolas.pitre@linaro.org wrote:
On Fri, 18 Jul 2014, Steven Rostedt wrote:
On Fri, 18 Jul 2014 01:18:55 -0400 Nicolas Pitre nicolas.pitre@linaro.org wrote:
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c index be8e1bde07..e154d176cf 100644 --- a/arch/x86/kernel/smp.c +++ b/arch/x86/kernel/smp.c @@ -31,6 +31,12 @@ #include <asm/apic.h> #include <asm/nmi.h> #include <asm/trace/irq_vectors.h>
+#define CREATE_TRACE_POINTS +#undef TRACE_INCLUDE_PATH +#undef TRACE_INCLUDE_FILE
I'm curious to why you added the #undefs. I wouldn't think they were needed.
They are needed because asm/trace/irq_vectors.h included prior that point defines them for itself and that makes the inclusion of trace/events/ipi.h fail.
Bah, I tried to get rid of the need for those, but it seems that the macro magic is getting a bit out of hand. I need a new macro magic hat :-p
What you got here will have to do.
OK.
Now the real question: should I submit it for mainline? I'm a little bothered by the fact that all exception vectors already have tracepoints of their own, albeit deeply tied to X86 nomenclature. But nothing that relates to IPI sources. This patch fixes that by adding some tracepoints alongside existing ones which may go a long way in making their instrumentation with generic (arch independent) tools.
What do people think?
Nicolas
On Fri, 2014-07-18 at 01:18 -0400, Nicolas Pitre wrote:
On X86 there are already tracepoints for IRQ vectors through which IPIs are handled. However this is highly X86 specific, and the IPI signaling is not currently traced.
This is an attempt at adding generic IPI tracepoints to X86.
I welcome this, and fwiw have been trying out this patch. One thing I would like to see, but due to overhead would probably be better suited in userspace (trace-cmd, maybe?), is a more packed description of the IPI. Ie: unifying ipi_init and ipi_exit and show overall cost of the IPI.
Thanks, Davidlohr
On Sun, 20 Jul 2014, Davidlohr Bueso wrote:
On Fri, 2014-07-18 at 01:18 -0400, Nicolas Pitre wrote:
On X86 there are already tracepoints for IRQ vectors through which IPIs are handled. However this is highly X86 specific, and the IPI signaling is not currently traced.
This is an attempt at adding generic IPI tracepoints to X86.
I welcome this, and fwiw have been trying out this patch. One thing I would like to see, but due to overhead would probably be better suited in userspace (trace-cmd, maybe?), is a more packed description of the IPI. Ie: unifying ipi_init and ipi_exit and show overall cost of the IPI.
That's best suited for the tool consuming the log in user space. The same is done with IRQ events already.
Nicolas
linaro-kernel@lists.linaro.org