
[linux] Linux kernel time management basics

2014-06-24 16:30

I. Linux time management basics

/article/1880849.html

/article/8792871.html

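All timekeeping in Linux rests on the underlying hardware, which includes the GPT (general-purpose timer) and the CPU local timer; the GPT, for example, runs from a 13 MHz clock. The low-level time architecture in Linux is split into the clock source and the clock event device. Above the clock source sit Xtimer and Hrtimer. Xtimer mainly refers to wall time (read from the RTC registers at boot), while Hrtimer is the high-resolution timer, with precision down to the nanosecond level. The clock event device provides jiffies and the timer wheel to the upper layers; for example, the minimum granularity of process switching is 10 ms.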

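struct clocksource defines the basic attributes and behavior of a clock device. Such devices can generally count, time, and generate interrupts; the GPT is one example.

struct clock_event_device: the main job of a clock event device is to dispatch clock events and program the next trigger condition. Before clock event devices existed, clock interrupts were always generated periodically, which is the familiar jiffies and HZ model.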

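II. The concepts of jiffies and HZ

On ARM systems HZ is typically 100, meaning there are 100 ticks per second. jiffies is the total number of ticks since the system booted. It is normally of type unsigned long, so it can overflow. For example: unsigned long jiffies; unsigned long timeout = jiffies + HZ/2; denotes a point 0.5 s in the future.

The jiffies wraparound problem: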

unsigned long jiffies;                    /* global tick counter maintained by the kernel */
unsigned long timeout = jiffies + HZ/2;   /* expires 0.5 s from now */
/* ...... */
if (timeout > jiffies) {
        /* not timed out yet, good */
}
else {
        /* timed out, an error occurred */
}

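Here jiffies is an ever-increasing unsigned long, and timeout can be seen as an unsigned long that is "not much larger" than jiffies. When jiffies grows past the largest value the type can hold (2^32-1 on a 32-bit system), it overflows and wraps around to near 0. At that point the condition evaluates to true: the timeout has actually expired, yet the code concludes that it has not.

The Linux kernel provides a set of macros to solve this problem. Among them, the macro time_after(a, b) checks whether time a is after time b (that is, "b < a") while taking a possible overflow into account.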

#define time_after(a, b) ((long)((b) - (a)) < 0)   /* simplified; the kernel version also type-checks a and b */
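To make the difference concrete, here is a minimal userspace sketch that contrasts the naive comparison with a wrap-safe one. The macro is a simplified copy without the kernel's type checks, and the helper names naive_expired and safe_expired are hypothetical, introduced only for this illustration.

```c
#include <stdbool.h>

/* Simplified userspace copy of the kernel macro (type checks omitted). */
#define time_after(a, b) ((long)((b) - (a)) < 0)

/* Naive check from the text: compares the raw counter values,
 * so it gives the wrong answer once the tick counter has wrapped. */
static bool naive_expired(unsigned long now, unsigned long timeout)
{
    return !(timeout > now);
}

/* Wrap-safe check: the unsigned difference is reinterpreted as signed,
 * so it stays correct as long as the two times are less than half the
 * counter range apart. */
static bool safe_expired(unsigned long now, unsigned long timeout)
{
    return time_after(now, timeout);
}
```

For instance, with start = ULONG_MAX - 100, timeout = start + 50 and now = start + 120 (which has wrapped around to 19), naive_expired() reports that the timeout is still pending, while safe_expired() correctly reports that it has passed.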

III. Clock Source Struct

struct clocksource {
    cycle_t (*read)(struct clocksource *cs);
    cycle_t cycle_last;
    cycle_t mask;
    u32 mult;
    u32 shift;
    u64 max_idle_ns;
    u32 maxadj;
#ifdef CONFIG_ARCH_CLOCKSOURCE_DATA
    struct arch_clocksource_data archdata;
#endif
    const char *name;
    struct list_head list;
    int rating;
    int (*enable)(struct clocksource *cs);
    void (*disable)(struct clocksource *cs);
    unsigned long flags;
    void (*suspend)(struct clocksource *cs);
    void (*resume)(struct clocksource *cs);

    /* private: */
#ifdef CONFIG_CLOCKSOURCE_WATCHDOG
    /* Watchdog related data, used by the framework */
    struct list_head wd_list;
    cycle_t cs_last;
    cycle_t wd_last;
#endif
} ____cacheline_aligned;

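1 rating: the precision of the clock source

A single device can have several clock sources. The precision of each one is determined by the frequency of the clock that drives it; a clock source driven by a 10 MHz clock, for example, has a precision of 100 ns. The rating field in struct clocksource indicates the precision band of the clock source. Its value ranges are: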

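1--99: not suitable as a real clock source; only for use during boot or for testing;

100--199: basically usable; can serve as a real clock source, but not recommended;

200--299: good precision; can serve as a real clock source;

300--399: very good; an accurate clock source;

400--499: the ideal clock source; it must be selected whenever it is available.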
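The two ideas above (precision derived from the driving frequency, and rating bands) can be sketched in a few lines of C. This is an illustration only: clock_resolution_ns and rating_band are hypothetical helpers, not kernel functions, and the band labels are shorthand for the descriptions in the list.

```c
#include <stdint.h>

/* Resolution in nanoseconds of a counter driven at freq_hz.
 * Integer division, so the result is rounded down. */
static uint64_t clock_resolution_ns(uint64_t freq_hz)
{
    return freq_hz ? 1000000000ULL / freq_hz : 0;
}

/* Hypothetical helper: map a rating value to the band it falls in. */
static const char *rating_band(int rating)
{
    if (rating >= 400 && rating <= 499)
        return "ideal";
    if (rating >= 300 && rating <= 399)
        return "very good";
    if (rating >= 200 && rating <= 299)
        return "good";
    if (rating >= 100 && rating <= 199)
        return "usable";
    if (rating >= 1 && rating <= 99)
        return "test-only";
    return "out of range";
}
```

With these helpers, clock_resolution_ns(10000000) gives 100, matching the 10 MHz example in the text.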

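2 The read callback

A clock source does not generate interrupts by itself. The only way to obtain its current count is to call its read callback explicitly. Note that this yields only a counter value, the so-called cycle; to obtain the corresponding time, the value must be converted using the mult and shift fields of the clocksource.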

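3 The mult and shift fields

Because the value read from a clocksource is a cycle count, converting it to time requires knowing the frequency F of the clock that drives the clocksource. A simple calculation then does the job: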

t = cycle/F;

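However, struct clocksource does not store the clock frequency F, because evaluating the formula above requires floating-point arithmetic, which is not allowed in the kernel. The kernel therefore uses a workaround: based on the clock frequency and the desired precision, it precomputes two helper constants, mult and shift, and then converts between cycle and t with the following formula: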

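t = (cycle * mult) >> shift;

This works as long as we ensure that:

F = (1 << shift) / mult;

where F is expressed in cycles per nanosecond when t is in nanoseconds. Internally the kernel performs this conversion with 64-bit arithmetic.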
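The conversion above can be tried out in userspace. The sketch below assumes a frequency given in Hz and a shift value chosen by the caller; hz_to_mult and cyc_to_ns are hypothetical names used for illustration, loosely modeled on the idea of precomputing mult once and then converting with a multiply and a shift.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Precompute mult so that (cycles * mult) >> shift approximates the
 * elapsed nanoseconds for a counter running at freq_hz.
 * Assumes freq_hz != 0 and that the result fits in 32 bits. */
static uint32_t hz_to_mult(uint32_t freq_hz, uint32_t shift)
{
    uint64_t tmp = NSEC_PER_SEC << shift;

    tmp += freq_hz / 2;               /* round to nearest */
    return (uint32_t)(tmp / freq_hz);
}

/* Integer-only cycle-to-nanosecond conversion using 64-bit arithmetic.
 * The product can overflow for very large cycle counts, which is one
 * reason the interval between two reads has to be bounded. */
static uint64_t cyc_to_ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
    return (cycles * mult) >> shift;
}
```

For a 10 MHz counter and shift = 24, hz_to_mult() returns 1677721600 (100 * 2^24), so one cycle converts to 100 ns and 10,000,000 cycles convert to exactly one second. A larger shift gives a more precise mult but leaves less headroom before the 64-bit product overflows.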

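xtime is the wall time that people use in everyday life.

monotonic time increases monotonically from the moment the system boots. Unlike xtime, it does not jump when the user adjusts the time. It does not, however, count the time the system spends suspended; in other words, monotonic time does not advance while the system is suspended.

The kernel organizes its time-related data in the timekeeper structure, which is defined as follows: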

IV. Struct timekeeper

struct timekeeper {
    /* Current clocksource used for timekeeping. */
    struct clocksource     *clock;
    /* NTP adjusted clock multiplier */
    u32                    mult;
    /* The shift value of the current clocksource. */
    u32                    shift;
    /* Number of clock cycles in one NTP interval. */
    cycle_t                cycle_interval;
    /* Last cycle value (also stored in clock->cycle_last) */
    cycle_t                cycle_last;
    /* Number of clock shifted nano seconds in one NTP interval. */
    u64                    xtime_interval;
    /* shifted nano seconds left over when rounding cycle_interval */
    s64                    xtime_remainder;
    /* Raw nano seconds accumulated per NTP interval. */
    u32                    raw_interval;

    /* Current CLOCK_REALTIME time in seconds */
    u64                    xtime_sec;
    /* Clock shifted nano seconds */
    u64                    xtime_nsec;

    /* Difference between accumulated time and NTP time in ntp
     * shifted nano seconds. */
    s64                    ntp_error;
    /* Shift conversion between clock shifted nano seconds and
     * ntp shifted nano seconds. */
    u32                    ntp_error_shift;

    /*
     * wall_to_monotonic is what we need to add to xtime (or xtime corrected
     * for sub jiffie times) to get to monotonic time.  Monotonic is pegged
     * at zero at system boot time, so wall_to_monotonic will be negative,
     * however, we will ALWAYS keep the tv_nsec part positive so we can use
     * the usual normalization.
     *
     * wall_to_monotonic is moved after resume from suspend for the
     * monotonic time not to jump. We need to add total_sleep_time to
     * wall_to_monotonic to get the real boot based time offset.
     *
     * - wall_to_monotonic is no longer the boot time, getboottime must be
     * used instead.
     */
    struct timespec        wall_to_monotonic;
    /* Offset clock monotonic -> clock realtime */
    ktime_t                offs_real;
    /* time spent in suspend */
    struct timespec        total_sleep_time;
    /* Offset clock monotonic -> clock boottime */
    ktime_t                offs_boot;
    /* The raw monotonic time for the CLOCK_MONOTONIC_RAW posix clock. */
    struct timespec        raw_time;
    /* The current UTC to TAI offset in seconds */
    s32                    tai_offset;
    /* Offset clock monotonic -> clock tai */
    ktime_t                offs_tai;

};

The kernel defines a variable, wall_to_monotonic, that records the offset between wall time and monotonic time. To obtain the monotonic time, simply add xtime and wall_to_monotonic.
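The addition described above can be illustrated with ordinary struct timespec values in userspace. This is a sketch under the assumption that both inputs already have tv_nsec in the range [0, 1 s), as the struct comment requires for wall_to_monotonic; ts_add and to_monotonic are hypothetical helper names, not kernel functions.

```c
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Add two timespec values and renormalize so that tv_nsec stays in
 * [0, NSEC_PER_SEC). Assumes both inputs are already normalized. */
static struct timespec ts_add(struct timespec a, struct timespec b)
{
    struct timespec r;

    r.tv_sec = a.tv_sec + b.tv_sec;
    r.tv_nsec = a.tv_nsec + b.tv_nsec;
    if (r.tv_nsec >= NSEC_PER_SEC) {
        r.tv_nsec -= NSEC_PER_SEC;
        r.tv_sec++;
    }
    return r;
}

/* monotonic = xtime + wall_to_monotonic */
static struct timespec to_monotonic(struct timespec xtime,
                                    struct timespec wall_to_monotonic)
{
    return ts_add(xtime, wall_to_monotonic);
}
```

Because the monotonic clock starts at zero at boot, wall_to_monotonic has a negative tv_sec. For example, with xtime = {1403598600 s, 700000000 ns} and wall_to_monotonic = {-1403598500 s, 600000000 ns}, the sum normalizes to 101 s and 300000000 ns since boot.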

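Timekeeper initialization: the timekeeper is initialized by timekeeping_init, which is called during the start_kernel initialization sequence.

Time updates: once xtime has been initialized, the timekeeper runs independently of the RTC and uses its associated clocksource to keep the time up to date. How often the update happens depends on the kernel configuration. If the NO_HZ option is not configured, do_timer is normally called once per tick, that is, once per periodic timer interrupt.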
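As a rough illustration of the per-tick update, the sketch below accumulates nanoseconds from successive counter readings. It is a deliberately simplified model, not the kernel's implementation: struct sim_timekeeper and sim_tick_update are hypothetical, only a handful of the fields shown earlier are mirrored, and sub-nanosecond remainders are simply dropped.

```c
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for the timekeeper state. */
struct sim_timekeeper {
    uint64_t cycle_last; /* counter value at the previous update */
    uint64_t mask;       /* valid bits of the hardware counter */
    uint32_t mult;       /* cycle-to-ns multiplier */
    uint32_t shift;      /* cycle-to-ns shift */
    uint64_t ns;         /* nanoseconds accumulated so far */
};

/* Called once per tick with the current counter reading. */
static void sim_tick_update(struct sim_timekeeper *tk, uint64_t cycle_now)
{
    /* Masking the difference keeps the delta correct even if the
     * hardware counter wrapped since the last update. */
    uint64_t delta = (cycle_now - tk->cycle_last) & tk->mask;

    tk->ns += (delta * tk->mult) >> tk->shift;
    tk->cycle_last = cycle_now;
}
```

Taking a 32-bit counter at 10 MHz (mask = 0xFFFFFFFF, mult = 1677721600, shift = 24) with cycle_last = 0xFFFFFF00, an update with cycle_now = 0xC8 sees 456 elapsed cycles despite the wrap and adds 45600 ns.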
