Redis Expired-Key Deletion Strategies and Source Code Analysis

2017-11-03 11:34

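I. What expired keys are and how they are stored

Redis lets you attach an expiration time to a key. This is done with four commands: EXPIRE, PEXPIRE, EXPIREAT, and PEXPIREAT.

A Redis database is built mainly from two dictionaries. One holds the key-value pairs; the other holds the expiration time of every key that has one. We call the second one the expires dictionary.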

typedef struct redisDb
{
    dict *dict;     /* the keyspace: all key-value pairs */
    dict *expires;  /* expiration times of the keys that have one */
    ....
} redisDb;

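II. Deletion strategies for expired keys

Broadly, there are three strategies for deleting expired keys: timed deletion, lazy deletion, and periodic deletion.

1. Timed deletion: maintain a timer and delete each key the moment it expires. This frees memory the soonest, but it also costs the most CPU time.

2. Lazy deletion: check whether a key has expired only when the key is accessed, and delete it then. This is friendly to the CPU but not to memory, because an expired key that is never accessed is never freed.

3. Periodic deletion: run an expired-key sweep at regular intervals, and cap how long and how often each sweep runs. It is a compromise between the first two.

Redis combines lazy deletion with periodic deletion.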

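III. Lazy deletion

Before any command that reads or writes the database runs, Redis checks whether the key involved has expired. If it has, the key is deleted; if not, nothing happens. The check works like a filter placed in front of every key access.

Checking on every read and write is similar in spirit to progressive rehashing: the work is spread over many individual commands instead of being done in one expensive batch.

Lazy deletion is implemented by the expireIfNeeded function in db.c. It looks like this: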

int expireIfNeeded(redisDb *db, robj *key) {

    // Fetch the key's expiration time
    mstime_t when = getExpire(db,key);
    mstime_t now;

    // The key has no expiration time
    if (when < 0) return 0; /* No expire for this key */

    /* Don't expire anything while loading. It will be done later. */
    if (server.loading) return 0;

    /* If we are in the context of a Lua script, we claim that time is
     * blocked to when the Lua script started. This way a key can expire
     * only the first time it is accessed and not in the middle of the
     * script execution, making propagation to slaves / AOF consistent.
     * See issue #1525 on Github for more information. */
    now = server.lua_caller ? server.lua_time_start : mstime();

    /* If we are running in the context of a slave, return ASAP:
     * the slave key expiration is controlled by the master that will
     * send us synthesized DEL operations for expired keys.
     *
     * Still we try to return the right information to the caller,
     * that is, 0 if we think the key should be still valid, 1 if
     * we think the key is expired at this time. */
    // In other words, a replica never deletes the key here. It only
    // returns the logically correct answer; the real deletion happens
    // when the master's DEL arrives, which keeps the data in sync.
    if (server.masterhost != NULL) return now > when;

    // Reaching this point means the key has an expiration time
    // and this server is a master

    /* Return when this key has not expired */
    if (now <= when) return 0;

    /* Delete the key */
    server.stat_expiredkeys++;

    // Propagate the expiration to the AOF file and to the replicas
    propagateExpire(db,key);

    // Send the keyspace event notification
    notifyKeyspaceEvent(REDIS_NOTIFY_EXPIRED,
        "expired",key,db->id);

    // Remove the expired key from the database
    return dbDelete(db,key);
}

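The function is fairly simple. It first checks whether the key has an expiration time, that is, whether the expires dictionary holds an entry for it. If not, it returns 0. If there is an entry and that time has already passed, the key is deleted; if the key has not expired yet, the function returns 0 without touching it.

One more thing to note: in replication mode a replica never deletes expired keys on its own. It waits for a delete command from the master. When the master deletes an expired key, it propagates a synthesized DEL to its replicas (and to the AOF), and that is what removes the key on the replica.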
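The decision order inside expireIfNeeded can be sketched with a toy single-key model. This is a simplified illustration and not Redis code: `toy_key`, `toy_expire_if_needed`, and the `is_replica` flag are hypothetical names, the current time is passed in explicitly, and "deleting" only sets a flag.

```c
#include <stdbool.h>

/* Hypothetical toy key: expire_at_ms < 0 means "no expiration set". */
typedef struct toy_key {
    const char *name;
    long long expire_at_ms;
    bool deleted;
} toy_key;

/* Follows the same order of checks as expireIfNeeded, in simplified form.
 * Returns 1 if the key is logically expired, 0 otherwise. Only a master
 * (is_replica == false) actually marks the key as deleted. */
int toy_expire_if_needed(toy_key *k, long long now_ms, bool is_replica) {
    if (k->expire_at_ms < 0) return 0;                /* no expiration set */
    if (is_replica) return now_ms > k->expire_at_ms;  /* report only */
    if (now_ms <= k->expire_at_ms) return 0;          /* still valid */
    k->deleted = true;                                /* master deletes */
    return 1;
}
```

Calling it with the same expired key first as a replica and then as a master shows the difference: both calls return 1, but only the second one sets `deleted`.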

expireIfNeeded also calls getExpire, which is a good opportunity to review or get familiar with Redis's underlying data structures.

long long getExpire(redisDb *db, robj *key) {
    dictEntry *de;

    /* No expire? return ASAP */
    // Look the key up in the expires dictionary;
    // if it has no entry there, return -1 right away
    // dictSize: ht[0].used + ht[1].used
    if (dictSize(db->expires) == 0 ||
       (de = dictFind(db->expires,key->ptr)) == NULL) return -1;

    /* The entry was found in the expire dict, this means it should also
     * be present in the main dict (safety check). */
    redisAssertWithInfo(NULL,key,dictFind(db->dict,key->ptr) != NULL);

    // Return the expiration time
    return dictGetSignedIntegerVal(de);
}

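This function takes a database (redisDb) and a key object; key objects are always string objects. The object structure looks like this:

typedef struct redisObject
{
    unsigned type:4;
    unsigned encoding:4;
    ...
    void *ptr;
} robj;

ptr points to the underlying representation of the object, such as an sds string (raw or embstr encoding), a list, and so on. For a key object it points to the string holding the key name, so calling 'de = dictFind(db->expires,key->ptr)' returns the matching dictionary entry (dictEntry). The expires dictionary is implemented with dict as well, so it also stores its key-value pairs as dictEntry items, and dictGetSignedIntegerVal reads the expiration time from the entry as a signed integer.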
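How an entry can hold either a pointer or an expiration timestamp can be sketched with a union. The `toy_entry` type and the two helpers below are hypothetical names modelled loosely on dictEntry and dictGetSignedIntegerVal; the exact field layout of the real dictEntry is an assumption here, not something taken from the listing.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a dictionary entry. The value slot is a union,
 * so an expiration time can be stored inline as a signed 64-bit integer
 * instead of through a pointer to a separately allocated object. */
typedef struct toy_entry {
    const void *key;
    union {
        void *val;    /* used when the value is an object pointer */
        int64_t s64;  /* used when the value is a timestamp */
    } v;
    struct toy_entry *next;  /* chaining for hash collisions */
} toy_entry;

/* Store an expiration time (milliseconds) in the entry. */
void toy_set_expire(toy_entry *e, int64_t when_ms) {
    e->v.s64 = when_ms;
}

/* Read it back, in the spirit of dictGetSignedIntegerVal(de). */
int64_t toy_get_expire(const toy_entry *e) {
    return e->v.s64;
}
```

Storing the timestamp inline means that reading a key's expiration time costs one dictionary lookup and no extra pointer dereference.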

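Here is one place where lazy deletion is used:

Before a key's value is fetched, the key is checked and lazily deleted if it has expired.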

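/*
 * Look up the value of key in database db for a read operation.
 *
 * Update the server's hit/miss statistics depending on whether the
 * value was found.
 *
 * Returns the value object if found, NULL otherwise.
 */
robj *lookupKeyRead(redisDb *db, robj *key) {
    robj *val;

    // Check whether the key has expired.
    // This is the lazy deletion strategy: the function is called on every
    // read or write of the database. An expired key is deleted; a key that
    // has not expired is left untouched.
    expireIfNeeded(db,key);

    // Fetch the key's value from the database
    val = lookupKey(db,key);

    // Update the hit/miss statistics
    // hits:   stat_keyspace_hits
    // misses: stat_keyspace_misses
    if (val == NULL)
        server.stat_keyspace_misses++;
    else
        server.stat_keyspace_hits++;

    // Return the value
    return val;
}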

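IV. Periodic deletion

Periodic deletion is implemented by the activeExpireCycle function. It has to follow a policy that limits how long and how often each sweep runs, so it is more involved. The comments are detailed, though, so it can still be followed.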

void activeExpireCycle(int type) {
    /* This function has some global state in order to continue the work
     * incrementally across calls. */
    // current_db records how far the scan has progressed
    static unsigned int current_db = 0; /* Last DB tested. */
    static int timelimit_exit = 0;      /* Time limit hit in previous call? */
    static long long last_fast_cycle = 0; /* When last fast cycle ran. */

    unsigned int j, iteration = 0;
    // Number of databases handled per call; redis.h defines it as 16
    unsigned int dbs_per_call = REDIS_DBCRON_DBS_PER_CALL;
    // Time at which this call started, in microseconds
    long long start = ustime(), timelimit;

    // Fast mode
    if (type == ACTIVE_EXPIRE_CYCLE_FAST) {
        /* Don't start a fast cycle if the previous cycle did not exit
         * for time limit. Also don't repeat a fast cycle for the same period
         * as the fast cycle total duration itself. */
        if (!timelimit_exit) return;
        // ACTIVE_EXPIRE_CYCLE_FAST_DURATION = 1000 (microseconds)
        if (start < last_fast_cycle + ACTIVE_EXPIRE_CYCLE_FAST_DURATION*2) return;
        // A fast cycle is going to run; record when it started
        last_fast_cycle = start;
    }

    /* We usually should test REDIS_DBCRON_DBS_PER_CALL per iteration, with
     * two exceptions:
     *
     * 1) Don't test more DBs than we have.
     * 2) If last time we hit the time limit, we want to scan all DBs
     * in this iteration, as there is work to do in some DB and we don't want
     * expired keys to use memory for too much time. */
    if (dbs_per_call > server.dbnum || timelimit_exit)
        dbs_per_call = server.dbnum;

    /* We can use at max ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC percentage of CPU time
     * per iteration. Since this function gets called with a frequency of
     * server.hz times per second, the following is the max amount of
     * microseconds we can spend in this function. */
    // ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC defaults to 25, i.e. 25% of CPU time
    timelimit = 1000000*ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC/server.hz/100;
    // 0: the time limit has not been hit; 1: the cycle stopped because of it
    timelimit_exit = 0;
    if (timelimit <= 0) timelimit = 1;

    // In fast mode the cycle may run for at most FAST_DURATION
    // microseconds (1000 by default)
    if (type == ACTIVE_EXPIRE_CYCLE_FAST)
        timelimit = ACTIVE_EXPIRE_CYCLE_FAST_DURATION; /* in microseconds. */

    // Walk through the databases
    for (j = 0; j < dbs_per_call; j++) {
        int expired;
        // The database to process in this round
        redisDb *db = server.db+(current_db % server.dbnum);

        /* Increment the DB now so we are sure if we run out of time
         * in the current DB we'll restart from the next. This allows to
         * distribute the time evenly across DBs. */
        current_db++;

        /* Continue to expire if at the end of the cycle more than 25%
         * of the keys were expired. */
        do {
            unsigned long num, slots;
            long long now, ttl_sum;
            int ttl_samples;

            /* If there is nothing to expire try next DB ASAP. */
            // num is the number of keys that have an expiration time
            // dictSize: ht[0].used + ht[1].used
            if ((num = dictSize(db->expires)) == 0) {
                // avg_ttl is a statistic: the average remaining time to
                // live of the database's keys, in milliseconds
                db->avg_ttl = 0;
                break;
            }
            // Number of slots (buckets) in the expires hash tables
            // dictSlots: ht[0].size + ht[1].size
            // Note the difference from dictSize
            slots = dictSlots(db->expires);
            // Current time; mstime returns the UNIX time in milliseconds
            now = mstime();

            /* When there are less than 1% filled slots getting random
             * keys is expensive, so stop here waiting for better times...
             * The dictionary will be resized asap. */
            // DICT_HT_INITIAL_SIZE defaults to 4
            if (num && slots > DICT_HT_INITIAL_SIZE &&
                (num*100/slots < 1)) break;

            /* The main collection cycle. Sample random keys among keys
             * with an expire set, checking for expired ones. */
            // Number of expired keys deleted in this pass
            expired = 0;
            // Sum of the TTLs of the sampled keys
            ttl_sum = 0;
            // Number of keys sampled
            ttl_samples = 0;

            // Check at most LOOKUPS_PER_LOOP keys per pass (20 by default)
            if (num > ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP)
                num = ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP;

            // Sample the keys
            while (num--) {
                dictEntry *de;
                long long ttl;

                // Pick a random key from the expires dictionary
                if ((de = dictGetRandomKey(db->expires)) == NULL) break;
                // Compute its TTL; dictGetSignedIntegerVal returns the
                // entry's value as a signed integer
                ttl = dictGetSignedIntegerVal(de)-now;
                // If the key has expired, delete it and count it
                if (activeExpireCycleTryExpire(db,de,now)) expired++;
                if (ttl < 0) ttl = 0;
                // Accumulate the TTL and the sample count
                ttl_sum += ttl;
                ttl_samples++;
            }

            /* Update the average TTL stats for this database. */
            if (ttl_samples) {
                // Average TTL of the keys sampled in this pass
                long long avg_ttl = ttl_sum/ttl_samples;

                // First measurement for this database: initialize the value
                if (db->avg_ttl == 0) db->avg_ttl = avg_ttl;
                /* Smooth the value averaging with the previous one. */
                db->avg_ttl = (db->avg_ttl+avg_ttl)/2;
            }

            /* We can't block forever here even if there are many keys to
             * expire. So after a given amount of milliseconds return to the
             * caller waiting for the other active expire cycle. */

            // Count the passes of this sampling loop. One pass samples up
            // to 20 keys; it is not one database.
            iteration++;

            if ((iteration & 0xf) == 0 && /* check once every 16 iterations. */
                (ustime()-start) > timelimit)
            {
                // iteration is a multiple of 16 and the elapsed time
                // exceeds timelimit, so set timelimit_exit
                timelimit_exit = 1;
            }

            // The time limit was hit: return
            if (timelimit_exit) return;

            /* We don't repeat the cycle if there are less than 25% of keys
             * found expired in the current DB. */
            // Repeat only when more than a quarter of LOOKUPS_PER_LOOP
            // (more than 5 of 20) sampled keys turned out to be expired
        } while (expired > ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP/4);
    }
}

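Note that the time limit is computed by

timelimit = 1000000*ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC/server.hz/100;

that is, each call may use at most 25% of the interval between two consecutive calls (1/server.hz seconds), which amounts to roughly a quarter of one CPU.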
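To make the formula concrete, here is a small sketch that evaluates it for a few call frequencies. `slow_cycle_timelimit_us` is a hypothetical helper, not a Redis function, and the hz values in the note below are examples only (10 is, as far as I know, the usual default, but treat that as an assumption).

```c
/* Slow-cycle time budget for one call, in microseconds, following the
 * formula from the listing. slow_time_perc is the CPU percentage (25 in
 * the listing) and hz is the number of calls per second. */
long long slow_cycle_timelimit_us(int slow_time_perc, int hz) {
    long long timelimit = 1000000LL * slow_time_perc / hz / 100;
    /* Same guard as the listing: never return a zero budget. */
    if (timelimit <= 0) timelimit = 1;
    return timelimit;
}
```

With hz = 10 the budget is 25000 microseconds (25 ms out of every 100 ms); with hz = 100 it is 2500 microseconds per call. The fraction of CPU time stays at 25% either way; a higher hz only makes each slice shorter.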

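The elapsed-time check itself is throttled through iteration, which counts passes of the sampling loop rather than databases: the elapsed time is compared against timelimit only once every 16 passes, and the function returns as soon as such a check finds the limit exceeded.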
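The effect of the `(iteration & 0xf) == 0` test can be checked with a short sketch. `clock_checks` is a hypothetical helper that counts how many loop passes would read the clock; like the listing, it increments the counter before testing it, so the first tested value is 1.

```c
/* Counts how many of the first `iterations` loop passes would read the
 * clock under the (iteration & 0xf) == 0 test used in the listing. */
unsigned int clock_checks(unsigned int iterations) {
    unsigned int checks = 0;
    unsigned int iteration;
    for (iteration = 1; iteration <= iterations; iteration++) {
        /* The low four bits are all zero only for multiples of 16. */
        if ((iteration & 0xf) == 0) checks++;
    }
    return checks;
}
```

Fifteen passes never read the clock, the sixteenth does, and 160 passes read it 10 times. Because the time is only looked at every 16 passes, a call can overrun timelimit by up to 15 passes before it notices.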

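The redisDb structure also has an avg_ttl member, a statistic that tracks the average TTL of the database's keys. activeExpireCycle estimates it by sampling: the value is computed only from the entries picked at random with dictGetRandomKey, and each new sample average is smoothed by averaging it with the previously stored value.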
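The smoothing rule can be isolated in a small function. `update_avg_ttl` is a hypothetical helper that applies the same arithmetic as the listing to plain numbers; it is a sketch of the update rule, not Redis code.

```c
/* Applies the avg_ttl update rule from the listing. stored_avg is the
 * previous db->avg_ttl; ttl_sum and ttl_samples describe the current
 * sample. Returns the new stored average. */
long long update_avg_ttl(long long stored_avg, long long ttl_sum,
                         int ttl_samples) {
    long long avg_ttl;
    if (ttl_samples == 0) return stored_avg;   /* nothing sampled */
    avg_ttl = ttl_sum / ttl_samples;           /* average of this sample */
    if (stored_avg == 0) stored_avg = avg_ttl; /* first measurement */
    return (stored_avg + avg_ttl) / 2;         /* smooth with old value */
}
```

Starting from 0, a sample of three keys with a TTL sum of 6000 ms stores 2000. A following sample averaging 4000 ms moves the stored value to 3000 rather than straight to 4000, so one unusual sample shifts the statistic by only half the difference.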

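If the fill ratio of a database's expires dictionary (used/size) is below 1%, picking random keys becomes expensive because most slots are empty. In that case the function skips the database and waits until the dictionary has been shrunk before sampling it again.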
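The sparsity test uses integer arithmetic, which is easy to misread, so here is a sketch of it in isolation. `too_sparse_to_sample` and `TOY_HT_INITIAL_SIZE` are hypothetical names; the constant is assumed to match the DICT_HT_INITIAL_SIZE value of 4 mentioned in the listing's comments.

```c
#define TOY_HT_INITIAL_SIZE 4  /* assumed to match DICT_HT_INITIAL_SIZE */

/* Returns 1 when sampling should be skipped because fewer than 1% of the
 * slots are in use, mirroring the condition in the listing. num is the
 * number of entries and slots is the number of hash table slots. */
int too_sparse_to_sample(unsigned long num, unsigned long slots) {
    return num && slots > TOY_HT_INITIAL_SIZE && (num * 100 / slots < 1);
}
```

With 1024 slots, 10 entries give 10*100/1024 = 0 under integer division, so the database is skipped, while 11 entries give 1 and are sampled. A table that is still at its initial size is never skipped.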
