
ftrace framework

2016-03-19 13:41
//--------------------------------------------------------------------------

// ftrace.txt

Introduction
------------

Ftrace is an internal tracer designed to help developers and system
designers find out what is going on inside the kernel. It can be used
for debugging or for analyzing latencies and performance issues that
take place outside of user space.

Although ftrace is typically considered the function tracer, it is
really a framework of several assorted tracing utilities. There is
latency tracing to examine what occurs between interrupts being
disabled and enabled, as well as for preemption, and from the time a
task is woken to the time the task is actually scheduled in.

One of the most common uses of ftrace is event tracing. Throughout the
kernel are hundreds of static event points that can be enabled via the
debugfs file system to see what is going on in certain parts of the
kernel.

Implementation Details
----------------------

See ftrace-design.txt for details for arch porters and such.

The File System
---------------

Ftrace uses the debugfs file system to hold the control files as well
as the files to display output.
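
As a rough illustration of what "control files" means in practice, the
sketch below builds the path of a control file and writes a value to it.
It assumes debugfs is mounted at /sys/kernel/debug, so that the tracing
directory is /sys/kernel/debug/tracing (the usual location, but not
guaranteed on every system). The helper names tracing_path() and
tracing_write() are made up for this example.

```c
#include <stdio.h>

/* Assumption: debugfs is mounted at /sys/kernel/debug. */
#define TRACING_DIR "/sys/kernel/debug/tracing"

/*
 * Hypothetical helper: build "<TRACING_DIR>/<name>" in buf.
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
int tracing_path(char *buf, size_t len, const char *name)
{
	int n = snprintf(buf, len, "%s/%s", TRACING_DIR, name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: write a string to a control file.
 * Returns 0 on success, -1 on any error (for example when not running
 * as root or when debugfs is not mounted).
 */
int tracing_write(const char *name, const char *value)
{
	char path[256];
	FILE *f;
	int ret = 0;

	if (tracing_path(path, sizeof(path), name) < 0)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fputs(value, f) == EOF)
		ret = -1;
	if (fclose(f) == EOF)
		ret = -1;
	return ret;
}
```

With these helpers, something like tracing_write("tracing_on", "1")
would switch tracing on, provided the caller has permission to write
to the file.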

The Tracers
-----------

Examples of using the tracer
----------------------------

//--------------------------------------------------------------------------

//ftrace-design.txt

The tracing hook is injected at compile time. Also note that
HAVE_FUNCTION_TRACER and FUNCTION_TRACER are two different macros.

ifdef CONFIG_FUNCTION_TRACER
ORIG_CFLAGS := $(KBUILD_CFLAGS)
KBUILD_CFLAGS = $(subst -pg,,$(ORIG_CFLAGS))
endif

HAVE_FUNCTION_TRACER
--------------------

You will need to implement the mcount and the ftrace_stub functions.

The exact mcount symbol name will depend on your toolchain. Some call it
"mcount", "_mcount", or even "__mcount". You can probably figure it out by
running something like:

	$ echo 'int main(void){}' | gcc -x c -S -o - - -pg | grep mcount
	        call    mcount

We'll make the assumption below that the symbol is "mcount" just to keep
things nice and simple in the examples.

Keep in mind that the ABI that is in effect inside of the mcount function is
*highly* architecture/toolchain specific. We cannot help you in this regard,
sorry. Dig up some old documentation and/or find someone more familiar than
you to bang ideas off of. Typically, register usage (argument/scratch/etc...)
is a major issue at this point, especially in relation to the location of the
mcount call (before/after function prologue). You might also want to look at
how glibc has implemented the mcount function for your architecture. It might
be (semi-)relevant.

The mcount function should check the function pointer ftrace_trace_function
to see if it is set to ftrace_stub. If it is, there is nothing for you to do,
so return immediately. If it isn't, then call that function in the same way
the mcount function normally calls __mcount_internal -- the first argument is
the "frompc" while the second argument is the "selfpc" (adjusted to remove the
size of the mcount call that is embedded in the function).

For example, if the function foo() calls bar(), when the bar() function calls
mcount(), the arguments mcount() will pass to the tracer are:

	"frompc" - the address bar() will use to return to foo()
	"selfpc" - the address of bar() (with mcount() size adjustment)

Also keep in mind that this mcount function will be called *a lot*, so
optimizing for the default case of no tracer will help the smooth running of
your system when tracing is disabled. So the start of the mcount function is
typically the bare minimum with checking things before returning. That also
means the code flow should usually be kept linear (i.e. no branching in the nop
case). This is of course an optimization and not a hard requirement.

Here is some pseudo code that should help (these functions should actually be
implemented in assembly):

void ftrace_stub(void)
{
	return;
}

void mcount(void)
{
	/* save any bare state needed in order to do initial checking */

	extern void (*ftrace_trace_function)(unsigned long, unsigned long);
	if (ftrace_trace_function != ftrace_stub)
		goto do_trace;

	/* restore any bare state */

	return;

do_trace:

	/* save all state needed by the ABI (see paragraph above) */

	unsigned long frompc = ...;
	unsigned long selfpc = <return address> - MCOUNT_INSN_SIZE;
	ftrace_trace_function(frompc, selfpc);

	/* restore all state needed by the ABI */
}
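
The pseudo code above cannot be compiled as written, so here is a small
user-space model of the same dispatch logic in plain C. It is only a
sketch: real mcount implementations are written in assembly and read
both addresses from registers or the stack, whereas this model takes
them as arguments. The value 5 for MCOUNT_INSN_SIZE (the size of an x86
call instruction) and the names mcount_model() and record_tracer() are
assumptions made for the example. ftrace_stub is also given the
two-argument signature here so that the pointer comparison is
type-correct.

```c
/* Assumption: size of the call instruction that invokes mcount. */
#define MCOUNT_INSN_SIZE 5

typedef void (*trace_fn_t)(unsigned long frompc, unsigned long selfpc);

/* Default "no tracer" callback: does nothing. */
void ftrace_stub(unsigned long frompc, unsigned long selfpc)
{
	(void)frompc;
	(void)selfpc;
}

/* Points at ftrace_stub while tracing is disabled. */
trace_fn_t ftrace_trace_function = ftrace_stub;

/* Hypothetical tracer that remembers the last call it saw. */
unsigned long last_frompc, last_selfpc, trace_hits;

void record_tracer(unsigned long frompc, unsigned long selfpc)
{
	last_frompc = frompc;
	last_selfpc = selfpc;
	trace_hits++;
}

/*
 * Model of mcount. ret_addr is the address mcount would return to
 * inside the traced function; subtracting the size of the call
 * instruction gives the "selfpc" value described above.
 */
void mcount_model(unsigned long frompc, unsigned long ret_addr)
{
	if (ftrace_trace_function == ftrace_stub)
		return;		/* fast path: tracing disabled */

	ftrace_trace_function(frompc, ret_addr - MCOUNT_INSN_SIZE);
}
```

Pointing ftrace_trace_function at record_tracer enables the slow path;
resetting it to ftrace_stub restores the cheap early return.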

//--------------------------------------------------------------------------

// tracepoints.txt

This document introduces Linux Kernel Tracepoints and their use. It
provides examples of how to insert tracepoints in the kernel and
connect probe functions to them and provides some examples of probe
functions.

* Purpose of tracepoints

A tracepoint placed in code provides a hook to call a function (probe)
that you can provide at runtime. A tracepoint can be "on" (a probe is
connected to it) or "off" (no probe is attached). When a tracepoint is
"off" it has no effect, except for adding a tiny time penalty
(checking a condition for a branch) and space penalty (a few bytes for
the function call at the end of the instrumented function, plus a data
structure in a separate section). When a tracepoint is "on", the
function you provide is called each time the tracepoint is executed, in
the execution context of the caller. When the function provided ends
its execution, it returns to the caller (continuing from the tracepoint
site).

You can put tracepoints at important locations in the code. They are
lightweight hooks that can pass an arbitrary number of parameters,
whose prototypes are described in a tracepoint declaration placed in a
header file.

They can be used for tracing and performance accounting.

* Usage

Two elements are required for tracepoints:

- A tracepoint definition, placed in a header file.
- The tracepoint statement, in C code.

In order to use tracepoints, you should include linux/tracepoint.h.

In include/trace/events/subsys.h:

#undef TRACE_SYSTEM
#define TRACE_SYSTEM subsys

#if !defined(_TRACE_SUBSYS_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_SUBSYS_H

#include <linux/tracepoint.h>

DECLARE_TRACE(subsys_eventname,
	TP_PROTO(int firstarg, struct task_struct *p),
	TP_ARGS(firstarg, p));

#endif /* _TRACE_SUBSYS_H */

/* This part must be outside protection */
#include <trace/define_trace.h>

In subsys/file.c (where the tracing statement must be added):

#include <trace/events/subsys.h>

#define CREATE_TRACE_POINTS
DEFINE_TRACE(subsys_eventname);

void somefct(void)
{
	...
	trace_subsys_eventname(arg, task);
	...
}

Where:
- subsys_eventname is an identifier unique to your event
    - subsys is the name of your subsystem.
    - eventname is the name of the event to trace.
- TP_PROTO(int firstarg, struct task_struct *p) is the prototype of the
  function called by this tracepoint.
- TP_ARGS(firstarg, p) are the parameter names, same as found in the
  prototype.
- if you use the header in multiple source files, #define CREATE_TRACE_POINTS
  should appear only in one source file.

Connecting a function (probe) to a tracepoint is done by providing a
probe (function to call) for the specific tracepoint through
register_trace_subsys_eventname(). Removing a probe is done through
unregister_trace_subsys_eventname(); it will remove the probe.
tracepoint_synchronize_unregister() must be called before the end of
the module exit function to make sure there is no caller left using
the probe. This, and the fact that preemption is disabled around the
probe call, make sure that probe removal and module unload are safe.
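
To make the register/call/unregister life cycle concrete, here is a
user-space model in plain C. It is a sketch only and not the kernel
implementation: it ignores concurrency entirely (so there is nothing
corresponding to tracepoint_synchronize_unregister()), it passes only
the integer argument, and the names register_probe(),
unregister_probe(), trace_model_event() and sum_probe() are invented
for the example. The extra "void *data" probe argument mirrors the
private data pointer that kernel probes receive.

```c
#define MAX_PROBES 4

/* Probe signature: private data pointer, then the traced argument. */
typedef void (*probe_fn_t)(void *data, int firstarg);

struct probe {
	probe_fn_t fn;
	void *data;
};

static struct probe probes[MAX_PROBES];
static int nr_probes;

/* Attach a probe; returns 0 on success, -1 if the table is full. */
int register_probe(probe_fn_t fn, void *data)
{
	if (nr_probes == MAX_PROBES)
		return -1;
	probes[nr_probes].fn = fn;
	probes[nr_probes].data = data;
	nr_probes++;
	return 0;
}

/* Detach a probe; returns 0 on success, -1 if it was not registered. */
int unregister_probe(probe_fn_t fn, void *data)
{
	for (int i = 0; i < nr_probes; i++) {
		if (probes[i].fn == fn && probes[i].data == data) {
			probes[i] = probes[nr_probes - 1];
			nr_probes--;
			return 0;
		}
	}
	return -1;
}

/* The tracepoint site: does nothing when no probe is attached. */
void trace_model_event(int firstarg)
{
	for (int i = 0; i < nr_probes; i++)
		probes[i].fn(probes[i].data, firstarg);
}

/* Example probe: adds firstarg to the counter passed as data. */
void sum_probe(void *data, int firstarg)
{
	*(long *)data += firstarg;
}
```

While no probe is registered, trace_model_event() is effectively a
no-op, which corresponds to the "off" state; after
register_probe(sum_probe, &total) every call adds its argument to
total until the probe is unregistered again.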

The tracepoint mechanism supports inserting multiple instances of the
same tracepoint, but a single definition must be made of a given
tracepoint name over all the kernel to make sure no type conflict will
occur. Name mangling of the tracepoints is done using the prototypes
to make sure typing is correct. Verification of probe type correctness
is done at the registration site by the compiler. Tracepoints can be
put in inline functions, inlined static functions, and unrolled loops
as well as regular functions.

The naming scheme "subsys_event" is suggested here as a convention
intended to limit collisions. Tracepoint names are global to the
kernel: they are considered as being the same whether they are in the
core kernel image or in modules.

If the tracepoint has to be used in kernel modules, an
EXPORT_TRACEPOINT_SYMBOL_GPL() or EXPORT_TRACEPOINT_SYMBOL() can be
used to export the defined tracepoints.

If you need to do a bit of work for a tracepoint parameter, and
that work is only used for the tracepoint, that work can be encapsulated
within an if statement with the following:

	if (trace_foo_bar_enabled()) {
		int i;
		int tot = 0;

		for (i = 0; i < count; i++)
			tot += calculate_nuggets();

		trace_foo_bar(tot);
	}

All trace_<tracepoint>() calls have a matching trace_<tracepoint>_enabled()
function defined that returns true if the tracepoint is enabled and
false otherwise. The trace_<tracepoint>() should always be within the
block of the if (trace_<tracepoint>_enabled()) to prevent races between
the tracepoint being enabled and the check being seen.

The advantage of using the trace_<tracepoint>_enabled() is that it uses
the static_key of the tracepoint to allow the if statement to be implemented
with jump labels and avoid conditional branches.
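
The effect of the guard can be checked with a user-space model. The
sketch below reuses the names from the example above (trace_foo_bar,
trace_foo_bar_enabled, calculate_nuggets), but everything else is an
assumption: a plain bool stands in for the static key, so there are no
jump labels involved, and do_work(), nuggets_calls and last_traced are
names invented here to make the behavior observable.

```c
#include <stdbool.h>

static bool foo_bar_enabled;	/* stands in for the tracepoint's static key */
static int nuggets_calls;	/* how often the expensive helper ran */
static int last_traced = -1;	/* last value handed to the tracepoint */

bool trace_foo_bar_enabled(void)
{
	return foo_bar_enabled;
}

/* Model of the tracepoint: records its argument only while enabled. */
void trace_foo_bar(int tot)
{
	if (foo_bar_enabled)
		last_traced = tot;
}

/* Stand-in for work that is only needed by the tracepoint. */
int calculate_nuggets(void)
{
	nuggets_calls++;
	return 3;
}

/* Hypothetical caller: the extra work is skipped while tracing is off. */
void do_work(int count)
{
	if (trace_foo_bar_enabled()) {
		int tot = 0;

		for (int i = 0; i < count; i++)
			tot += calculate_nuggets();

		trace_foo_bar(tot);
	}
}
```

With foo_bar_enabled left false, do_work() never calls
calculate_nuggets(); once it is set to true, the loop runs and the sum
reaches the tracepoint.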

//------------------------------------------------------------------------

event:

1. Introduction
===============

Tracepoints (see Documentation/trace/tracepoints.txt) can be used
without creating custom kernel modules to register probe functions
using the event tracing infrastructure.

Not all tracepoints can be traced using the event tracing system;
the kernel developer must provide code snippets which define how the
tracing information is saved into the tracing buffer, and how the
tracing information should be printed.
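
The two pieces a developer has to supply, "how it is saved" and "how it
is printed", can be pictured with a plain C model. In the kernel they
are provided through the TRACE_EVENT() macro; the sketch below does not
use that macro and only illustrates the split. The record layout and
the names sample_entry, sample_assign() and sample_print() are
hypothetical.

```c
#include <stdio.h>

/* Hypothetical record layout: what would be stored in the trace buffer. */
struct sample_entry {
	int  pid;
	char comm[16];
};

/* "How the information is saved": copy the arguments into the record. */
void sample_assign(struct sample_entry *e, int pid, const char *comm)
{
	e->pid = pid;
	/* snprintf truncates long names and always NUL-terminates. */
	snprintf(e->comm, sizeof(e->comm), "%s", comm);
}

/* "How the information is printed": format a stored record as text. */
int sample_print(char *buf, size_t len, const struct sample_entry *e)
{
	return snprintf(buf, len, "comm=%s pid=%d", e->comm, e->pid);
}
```

Keeping the two steps separate means the record can be stored cheaply
at the tracepoint site and formatted later, only when the trace is
actually read.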