
Maintaining 10 Billion URLs (Radix Tree)

2015-09-16 09:55, 405 views

http://s.sousb.com/2011/04/19/%E7%BB%B4%E6%8A%A4100%E4%BA%BF%E4%B8%AAurl/

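Problem: A URL such as http://www.baidu.com/s?wd=baidu has attributes, including fixed-length attributes (for example, the time the system discovered it) and variable-length attributes (for example, its description). Build a system that: a. stores and maintains 10 billion URLs and their attributes; b. supports adding, deleting, and updating URLs and their attributes; c. checks whether a URL is in the system and returns its information; d. quickly selects all URLs under a given site.

Hint: Because the data volume is large, it may be stored across multiple machines.

Analysis: This is a Baidu written-test question. It is fairly hard, and the author can only offer the few points he has recognized.

First, the URLs must be partitioned across X machines. Consider a hash function hash(hostname(url)) that assigns each URL to one of the X machines. This serves two purposes: the data is stored in a distributed way, and all URLs of one site end up on the same machine.
Second, how should each machine organize its data? One option is to use a database; here is another. Keep the URLs directly in memory and organize them as a tree. For strings the most commonly used tree is the trie, but because the space it occupies is determined by the longest URL, it is definitely unsuitable here. In addition, many URLs share common parts (such as the path), so a variant of the trie, the radix tree, saves a great deal of space by comparison without hurting efficiency.
Finally, with the storage model in place, how to answer questions a to d above is not worked through one by one here.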
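
The partitioning step hash(hostname(url)) can be sketched as follows. This is only an illustration under assumptions: the problem does not name a hash function, so FNV-1a is used as a stand-in, and the helper names url_host, fnv1a, and url_shard are hypothetical. The host extraction assumes the simple form scheme://host[:port][/path].

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the host part of `url` into `out` (capacity `cap`); returns its length.
 * Assumes the simple form scheme://host[:port][/path]. */
static size_t url_host(const char *url, char *out, size_t cap)
{
    const char *p = strstr(url, "://");
    size_t n;

    assert(cap > 0);
    p = p ? p + 3 : url;            /* skip the scheme if present */
    n = strcspn(p, "/:?#");         /* host ends at port, path, query or fragment */
    if (n >= cap)
        n = cap - 1;
    memcpy(out, p, n);
    out[n] = '\0';
    return n;
}

/* 32-bit FNV-1a: a stand-in for the unspecified hash function. */
static uint32_t fnv1a(const char *s, size_t n)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; i++) {
        h ^= (unsigned char)s[i];
        h *= 16777619u;
    }
    return h;
}

/* Machine index in [0, machines) for a URL. */
static unsigned url_shard(const char *url, unsigned machines)
{
    char host[256];
    size_t n = url_host(url, host, sizeof host);
    return fnv1a(host, n) % machines;
}
```

Because only the host name is hashed, two URLs of the same site always map to the same machine, which is what makes question d answerable on a single node.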

Radix tree

From Wikipedia, the free encyclopedia

(Redirected from Patricia trie)

In computer science, a radix tree (also patricia trie or radix trie or compact prefix tree) is a space-optimized trie data structure where each node with only one child is merged with its child. The result is that every internal node has at least two children. Unlike in regular tries, edges can be labeled with sequences of elements as well as single elements. This makes them much more efficient for small sets (especially if the strings are long) and for sets of strings that share long prefixes.
As an optimization, edge labels can be stored in constant size by using two pointers to a string (for the first and last elements).[1]
Note that although the examples in this article show strings as sequences of characters, the type of the string elements can be chosen arbitrarily (for example, as a bit or byte of the string representation when using multibyte character encodings or Unicode).


Applications

As mentioned, radix trees are useful for constructing associative arrays with keys that can be expressed as strings. They find particular application in the area of IP routing, where the ability to contain large ranges of values with a few exceptions is particularly suited to the hierarchical organization of IP addresses.[2] They are also used for inverted indexes of text documents in information retrieval.

Operations

Radix trees support insertion, deletion, and searching operations. Insertion adds a new string to the trie while trying to minimize the amount of data stored. Deletion removes a string from the trie. Searching operations include exact lookup, find predecessor, find successor, and find all strings with a prefix. All of these operations are O(k) where k is the maximum length of all strings in the set. This list may not be exhaustive.

Lookup

Figure: Finding a string in a Patricia trie

The lookup operation determines if a string exists in a trie. Most operations modify this approach in some way to handle their specific tasks. For instance, the node where a string terminates may be of importance. This operation is similar to tries except that some edges consume multiple elements.
The following pseudocode assumes that these classes exist.

Edge
    Node targetNode
    string label

Node
    Array of Edges edges
    function isLeaf()

function lookup(string x)
{
    // Begin at the root with no elements found
    Node traverseNode := root;
    int elementsFound := 0;

    // Traverse until a leaf is found or it is not possible to continue
    while (traverseNode != null && !traverseNode.isLeaf() && elementsFound < x.length)
    {
        // Get the next edge to explore based on the elements not yet found in x
        Edge nextEdge := select edge from traverseNode.edges where edge.label is a prefix of x.suffix(elementsFound)
        // x.suffix(elementsFound) returns the last (x.length - elementsFound) elements of x

        // Was an edge found?
        if (nextEdge != null)
        {
            // Set the next node to explore
            traverseNode := nextEdge.targetNode;

            // Increment elements found based on the label stored at the edge
            elementsFound += nextEdge.label.length;
        }
        else
        {
            // Terminate loop
            traverseNode := null;
        }
    }

    // A match is found if we arrive at a leaf node and have used up exactly x.length elements
    return (traverseNode != null && traverseNode.isLeaf() && elementsFound == x.length);
}
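
The pseudocode above can be rendered in C roughly as follows. This is a sketch under assumptions, not the article's implementation: the names rnode, redge, rnode_new, rnode_link, and radix_lookup are hypothetical, the fan-out is capped by an illustrative MAX_EDGES, and the tree is built by hand rather than by an insertion routine. One deliberate difference: after the input is consumed, the sketch follows an empty-label edge if one exists, so that a stored string which is a prefix of another (such as 'test' next to 'tester') is still found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_EDGES 8   /* illustrative fan-out limit */

struct rnode;
struct redge { const char *label; struct rnode *target; };
struct rnode { struct redge edges[MAX_EDGES]; int nedges; };

/* Allocate an empty node; a node without edges is a leaf. */
static struct rnode *rnode_new(void)
{
    struct rnode *n = calloc(1, sizeof *n);
    assert(n != NULL);
    return n;
}

/* Attach a new child under `parent` via an edge labelled `label`. */
static struct rnode *rnode_link(struct rnode *parent, const char *label)
{
    struct rnode *child = rnode_new();
    assert(parent->nedges < MAX_EDGES);
    parent->edges[parent->nedges].label = label;
    parent->edges[parent->nedges].target = child;
    parent->nedges++;
    return child;
}

/* True if `x` is stored: the walk ends on a leaf having consumed all of x. */
static bool radix_lookup(const struct rnode *root, const char *x)
{
    const struct rnode *node = root;
    size_t found = 0, len = strlen(x);

    while (node != NULL && node->nedges > 0 && found < len) {
        const struct rnode *next = NULL;
        for (int i = 0; i < node->nedges; i++) {
            size_t l = strlen(node->edges[i].label);
            /* the label must be a non-empty prefix of the unmatched suffix */
            if (l > 0 && strncmp(node->edges[i].label, x + found, l) == 0) {
                next = node->edges[i].target;
                found += l;
                break;
            }
        }
        node = next;   /* NULL ends the loop when no edge matches */
    }

    /* a string that is a prefix of another ends on an empty-label edge */
    if (node != NULL && found == len) {
        for (int i = 0; i < node->nedges; i++) {
            if (node->edges[i].label[0] == '\0') {
                node = node->edges[i].target;
                break;
            }
        }
    }
    return node != NULL && node->nedges == 0 && found == len;
}
```

Each loop iteration consumes a whole edge label, which is why a radix tree needs fewer steps than a plain trie on keys with long shared prefixes.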

Insertion

To insert a string, we search the tree until we can make no further progress. At this point we either add a new outgoing edge labeled with all remaining elements in the input string, or if there is already an outgoing edge sharing a prefix with the remaining input string, we split it into two edges (the first labeled with the common prefix) and proceed. This splitting step ensures that no node has more children than there are possible string elements.
Several cases of insertion are shown below, though more may exist. Note that r simply represents the root. It is assumed that edges can be labelled with empty strings to terminate strings where necessary and that the root has no incoming edge.

Insert 'water' at the root

Insert 'slower' while keeping 'slow'

Insert 'test' which is a prefix of 'tester'

Insert 'team' while splitting 'test' and creating a new edge label 'st'

Insert 'toast' while splitting 'te' and moving previous strings a level lower

Deletion

To delete a string x from a tree, we first locate the leaf representing x. Then, assuming x exists, we remove the corresponding leaf node. If the parent of our leaf node has only one other child, then that child's incoming label is appended to the parent's incoming label and the child is removed.

Additional Operations

Find all strings with common prefix: Returns an array of strings which begin with the same prefix.
Find predecessor: Locates the largest string less than a given string, by lexicographic order.
Find successor: Locates the smallest string greater than a given string, by lexicographic order.

History

Donald R. Morrison first described what he called "Patricia trees" in 1968;[3] the name comes from the acronym PATRICIA, which stands for "Practical Algorithm To Retrieve Information Coded In Alphanumeric". Gernot Gwehenberger independently invented and described the data structure at about the same time.[4]

Comparison to other data structures

(In the following comparisons, it is assumed that the keys are of length k and the data structure contains n members.)
Unlike balanced trees, radix trees permit lookup, insertion, and deletion in O(k) time rather than O(log n). This doesn't seem like an advantage, since normally k ≥ log n, but in a balanced tree every comparison is a string comparison requiring O(k) worst-case time, many of which are slow in practice due to long common prefixes (in the case where comparisons begin at the start of the string). In a trie, all comparisons require constant time, but it takes m comparisons to look up a string of length m. Radix trees can perform these operations with fewer comparisons, and require many fewer nodes.
Radix trees also share the disadvantages of tries, however: as they can only be applied to strings of elements or elements with an efficiently reversible mapping to strings, they lack the full generality of balanced search trees, which apply to any data type with a total ordering.
A reversible mapping to strings can be used to produce the required total ordering for balanced search trees, but not the other way around. This can also be problematic if a data type only provides a comparison operation, but not a (de)serialization operation.
Hash tables are commonly said to have expected O(1) insertion and deletion times, but this is only true when considering computation of the hash of the key to be a constant time operation. When hashing the key is taken into account, hash tables have expected O(k) insertion and deletion times, but may take longer in the worst-case depending on how collisions are handled. Radix trees have worst-case O(k) insertion and deletion. The successor/predecessor operations of radix trees are also not implemented by hash tables.

Variants

A common extension of radix trees uses two colors of nodes, 'black' and 'white'. To check if a given string is stored in the tree, the search starts from the top and follows the edges of the input string until no further progress can be made. If the search-string is consumed and the final node is a black node, the search has failed; if it is white, the search has succeeded. This enables us to add a large range of strings with a common prefix to the tree, using white nodes, then remove a small set of "exceptions" in a space-efficient manner by inserting them using black nodes.
The HAT-trie is a radix tree based cache-conscious data structure that offers efficient string storage and retrieval, and ordered iterations. Performance, with respect to both time and space, is comparable to the cache-conscious hashtable.[5][6] See HAT trie implementation notes at [1]

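Using a Radix Tree for Key-Value Data Routing

Introduction: As is well known, key-value stores such as NoSQL databases and Memcached route data with hash tables. Resolving hash collisions and choosing the size of the hash table are both real headaches.
With a radix tree we can achieve the same routing for keys of type uint32_t. The idea comes from the design of the IP routing table in the Linux kernel.

A traditional hash table can be simplified to a handful of interfaces whose meaning matches their names: create a hash table, insert, get, and delete.
Abstracting that functionality, a radix tree can provide the same interface:

int mc_radix_hash_ini(mc_radix_t *t, size_t nodenum);

int mc_radix_hash_insert(mc_radix_t *t, unsigned int hashvalue, void *data, size_t size);

int mc_radix_hash_del(mc_radix_t *t, unsigned int hashvalue);

void *mc_radix_hash_get(mc_radix_t *t, unsigned int hashvalue);

A brief introduction to the radix tree:
The radix tree used here is close to an ordinary binary tree, except that the search uses each bit of a value, for example an unsigned int, to decide which way to go at each node.
Take a number such as 1000101010101010010101010010101010 (made up). Insertion starts at the root; a 0 bit leads to the left child and a 1 bit to the right child. Tree nodes are created during insertion and removed during deletion. If that means too many calls to malloc, the nodes can be pooled by preallocating a batch of them, which is the approach this post takes.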

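typedef struct _node_t
{
    char zo;                    /* zero or one */
    int used_num;
    struct _node_t *parent;
    struct _node_t *left;
    struct _node_t *right;
    void *data;
    int index;                  /* for the nodes array list: finding the next empty node */
} mc_radix_node_t;

The node structure is defined as above.
zo can be ignored. The parent, left, and right pointers are what their names say, data points to the stored data, and index is the node's subscript in the node pool array.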

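The tree structure is defined as follows:

typedef struct _radix_t
{
    mc_radix_nodes_array_t *nodes;
    mc_radix_node_t *root;

    mc_slab_t *slab;

    /*
    pthread_mutex_t lock;
    */
    int magic;
    int totalnum;
    size_t pool_nodenum;

    mc_item_queue queue;
} mc_radix_t;

Ignore the structure of nodes for now; here it is just a pointer to the node pool.
root points to the root node. slab is the memory allocator used when storing data, for cases where managed memory is wanted to cut allocation overhead (see the chapter on the slab allocator).
magic is used to check whether the tree has been initialized, totalnum is the number of leaf nodes, and pool_nodenum is the number of nodes in the node pool.
queue is the queue of data items.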

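We use macros with binary-weighted (8421-style) values to test each bit:

#define U31_MASK 0x00000002

#define U32_MASK 0x00000001

Each bit is tested with a mask of this kind. There are better ways to do it; this one was chosen because it is simple and fast.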
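
The definition of MASKARRAY, which the later listings index by depth, is not shown in this post. A minimal sketch of how such a table could be built is given below. It assumes most-significant-bit-first order, which is consistent with U31_MASK and U32_MASK being the two lowest bits; mask_array_init and branch_at are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

#define KEY_BITS 32

/* MASKARRAY[i] selects the bit examined at depth i (MSB first, an assumption). */
static uint32_t MASKARRAY[KEY_BITS];

static void mask_array_init(void)
{
    for (int i = 0; i < KEY_BITS; i++)
        MASKARRAY[i] = UINT32_C(1) << (KEY_BITS - 1 - i);
}

/* Direction taken at a given depth: 1 = right child, 0 = left child. */
static int branch_at(uint32_t hashvalue, int depth)
{
    assert(depth >= 0 && depth < KEY_BITS);
    return (hashvalue & MASKARRAY[depth]) != 0;
}
```

With this layout MASKARRAY[30] equals U31_MASK and MASKARRAY[31] equals U32_MASK, so the table and the macros describe the same bits.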

　　
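The radix tree also has a few static helper functions that are not declared externally: one initializes the node pool (mc_radix_nodes_ini), one takes a node from the pool (mc_get_radix_node), and one returns a node to the pool (mc_free_radix_node).

Initializing the radix tree: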

int mc_radix_hash_ini(mc_radix_t *t, size_t nodenum)
{
    /* init the node pool */
    t->nodes = (mc_radix_nodes_array_t *)malloc(sizeof(mc_radix_nodes_array_t));  /* allocate the node pool */
    if (t->nodes == NULL)
    {
        return -1;
    }
    t->slab = mc_slab_create();             /* use the slab allocator */
    mc_radix_nodes_ini(t->nodes, nodenum);  /* initialize the nodes */
    t->magic = MC_MAGIC;
    t->totalnum = 0;
    t->pool_nodenum = nodenum;
    t->root = NULL;

    t->queue.head = NULL;
    t->queue.pear = NULL;
    t->queue.max_num = nodenum;
    t->queue.cur_num = 0;
    return 1;
}

int mc_radix_hash_insert(mc_radix_t *t, unsigned int hashvalue, void *data, size_t size)
{
    unsigned int i = 0;

    if (t->root == NULL)
    {
        t->root = mc_get_radix_node(t->nodes);
        if (t->root == NULL)
        {
            fprintf(stderr, "mc_get_radix_node error\n");
            return -1;
        }
    }

    /* LRU */
    /* All leaf nodes are linked into a doubly linked queue. Updates and
       inserts go to the head of the queue; once the queue passes a given
       fraction of its capacity, entries are deleted from the tail. */
    if (t->queue.cur_num >= (t->queue.max_num) * PERCENT)
    {
        for (i = 0; i < (t->queue.max_num) * (1 - PERCENT); i++)
        {
            mc_del_item(t, t->queue.pear);
        }
    }

    mc_radix_node_t *cur = t->root;
    for (i = 0; i < 32; i++)
    {
        /* probe the tree one bit at a time: 1 ---> right, 0 ---> left */
        mc_radix_node_t **next = (hashvalue & MASKARRAY[i]) ? &cur->right : &cur->left;

        if (*next == NULL)
        {
            *next = mc_get_radix_node(t->nodes);
            if (*next == NULL)
            {
                fprintf(stderr, "mc_get_radix_node error\n");
                return -1;
            }
        }
        cur->used_num++;
        (*next)->parent = cur;
        cur = *next;
    }

    /* cur is now the leaf for this key: copy the value into a slab slot */
    mc_slot_t *l_slot = mc_slot_alloc(t->slab, size);
    if (l_slot == NULL)
    {
        fprintf(stderr, "mc_slot_alloc error\n");
        return -1;
    }
    memcpy(l_slot->star, data, size);
    cur->data = l_slot;
    t->totalnum++;

    /* add to the head of t->queue; a leaf reuses left/right as queue links */
    cur->left = NULL;
    cur->right = t->queue.head;
    if (t->queue.head == NULL)
    {
        t->queue.pear = cur;
    }
    else
    {
        t->queue.head->left = cur;
    }
    t->queue.head = cur;
    t->queue.cur_num++;
    return 1;
}

Deleting a node, identified by its hashvalue:

int mc_radix_hash_del(mc_radix_t *t, unsigned int hashvalue)
{
    if (t == NULL || t->root == NULL)
    {
        return -1;
    }
    /* not initialized */
    if (t->magic != MC_MAGIC)
    {
        return -1;
    }
    mc_radix_node_t *cur = t->root;
    mc_radix_node_t *cur_par;
    int i;

    /* find the leaf first, so no counter is touched when the key is absent */
    for (i = 0; i < 32; i++)
    {
        cur = (hashvalue & MASKARRAY[i]) ? cur->right : cur->left;
        if (cur == NULL)
        {
            return -1;
        }
    }

    mc_slot_free(cur->data);

    /* remove from t->queue (on a leaf, left = previous and right = next) */
    if (cur->left != NULL)
    {
        cur->left->right = cur->right;
    }
    else
    {
        t->queue.head = cur->right;
    }
    if (cur->right != NULL)
    {
        cur->right->left = cur->left;
    }
    else
    {
        t->queue.pear = cur->left;
    }
    t->queue.cur_num--;
    t->totalnum--;

    /* walk back to the root: the leaf is always freed, every ancestor loses
       one key, and an ancestor left with no keys is unlinked and freed too */
    int drop = 1;
    while (cur != NULL)
    {
        cur_par = cur->parent;
        if (drop)
        {
            if (cur_par == NULL)
                t->root = NULL;
            else if (cur_par->left == cur)
                cur_par->left = NULL;
            else
                cur_par->right = NULL;
            mc_free_radix_node(t->nodes, cur);
        }
        if (cur_par != NULL)
        {
            cur_par->used_num--;
            drop = (cur_par->used_num <= 0);
        }
        cur = cur_par;
    }

    return 1;
}

Getting a value, returned through a void * pointer:

void *mc_radix_hash_get(mc_radix_t *t, unsigned int hashvalue)
{
    if (t == NULL || t->root == NULL)
    {
        fprintf(stderr, "t==NULL||t->root==NULL\n");
        return (void *)(0);
    }
    /* not initialized */
    if (t->magic != MC_MAGIC)
    {
        fprintf(stderr, "t->magic!=MC_MAGIC\n");
        return (void *)(0);
    }
    mc_radix_node_t *cur = t->root;
    mc_slot_t *ret_slot;
    int i;

    for (i = 0; i < 32; i++)
    {
        cur = (hashvalue & MASKARRAY[i]) ? cur->right : cur->left;
        if (cur == NULL)
        {
            /* the path ends early: the key is not stored */
            fprintf(stderr, "i=%d\n", i);
            return (void *)(0);
        }
    }
    ret_slot = cur->data;

    /* update LRU queue: move cur to the head unless it is already there */
    if (cur->left != NULL)
    {
        cur->left->right = cur->right;
        if (cur->right != NULL)
        {
            cur->right->left = cur->left;
        }
        else
        {
            /* cur->right == NULL: cur was the last element of the LRU queue */
            t->queue.pear = cur->left;
        }
        cur->left = NULL;
        cur->right = t->queue.head;
        t->queue.head->left = cur;
        t->queue.head = cur;
    }
    return (void *)(ret_slot->star);
}

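int mc_free_radix(mc_radix_t *t)
{
    mc_free_all_radix_node(t->nodes);
    mc_slab_free(t->slab);
    free(t->nodes);
    return 1;
}

static void mc_del_item(mc_radix_t *t, mc_radix_node_t *cur)
{
    if (cur == NULL || cur->left == NULL)
    {
        fprintf(stderr, "item number in LRU queue is too small\n");
        return;
    }
    if (cur->right != NULL)
    {
        fprintf(stderr, "cur should be the last of LRU queue\n");
        return;
    }
    /* remove from LRU queue and move the tail pointer back one item */
    mc_radix_node_t *pcur = cur->left;
    cur->left = NULL;
    pcur->right = NULL;
    t->queue.pear = pcur;
    t->queue.cur_num--;
    t->totalnum--;
    mc_slot_free(cur->data);

    /* remove from radix tree: free the leaf, decrement every ancestor, and
       unlink and free any ancestor that no other key passes through */
    int drop = 1;
    while (cur != NULL)
    {
        pcur = cur->parent;
        if (drop)
        {
            if (pcur == NULL)
                t->root = NULL;
            else if (pcur->left == cur)
                pcur->left = NULL;
            else
                pcur->right = NULL;
            mc_free_radix_node(t->nodes, cur);
        }
        if (pcur != NULL)
        {
            pcur->used_num--;
            drop = (pcur->used_num <= 0);
        }
        cur = pcur;
    }
}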

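Summary: The main benefit of a radix tree for key-value routing is that it avoids the dynamic resizing of a hash table and part of the collision problem. The structure is also easy to extend with an LRU algorithm for evicting data.
If initializing and allocating nodes seems too wasteful, a node pool can be used.

This article is original. When reposting, please credit the source. Contact the author: Email: zhangbo1@ijinshan.com, QQ: 51336447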

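Nginx Source Code Analysis: radix tree

Published 2013-03-03 23:05

This analysis is based on Nginx 1.2.6. Older or future versions may differ slightly, but the differences should be small, so it can serve as a reference.

A radix tree is a kind of dictionary tree that is well suited to building associative arrays. In information retrieval it can be used to build inverted indexes of documents, and it is also particularly useful in IP routing.

Nginx implements a radix tree and uses it mainly in the GEO module. This module has a single directive, geo, which defines a variable whose value depends on the client IP address (by default $remote_addr, although another variable can be specified). The module can be used for load balancing, sending requests from users in different address ranges to different backend servers. An example:

geo $country {
    default        no;
    127.0.0.0/24   us;   # the part before / is the IP address, the part after / is the mask
    127.0.0.1/32   ru;
    10.1.0.0/16    ru;
    192.168.1.0/24 uk;   # when the IP address is 192.168.1.23, the variable country has the value uk
}

When nginx parses this configuration it builds a data structure, and after accepting a request it looks up the variable value for the client IP address in that structure. The data structure is the radix tree. It is a binary tree in which every edge corresponds to one bit, 0 or 1.

[Figure: radix tree]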

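typedef struct ngx_radix_node_s  ngx_radix_node_t;

struct ngx_radix_node_s {
    ngx_radix_node_t  *right;
    ngx_radix_node_t  *left;
    ngx_radix_node_t  *parent;
    uintptr_t          value;
};

typedef struct {
    ngx_radix_node_t  *root;
    ngx_pool_t        *pool;
    ngx_radix_node_t  *free;
    char              *start;
    size_t             size;
} ngx_radix_tree_t;

To avoid frequently allocating and releasing ngx_radix_node_t structures, nodes are reused. ngx_radix32tree_delete does not release a deleted node's memory; instead it uses the free member of ngx_radix_tree_t to chain deleted nodes into a singly linked list, linked through their right pointers. When ngx_radix_alloc creates a new node, it first checks whether the list starting at free is empty, and if not, takes a node from it and returns its address. In addition, memory for the radix tree is allocated in units of a page: start points to the beginning of the unused memory in the current page, and size is the amount of space still available in that page.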
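
The node reuse described above can be sketched in isolation. This is not the Nginx code: node_t, tree_t, node_alloc, and node_release are hypothetical names, and the fallback uses malloc where Nginx carves nodes out of pool pages via start and size. Only the free-list mechanics follow the description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct node_s node_t;
struct node_s { node_t *right, *left, *parent; uintptr_t value; };

typedef struct { node_t *free; } tree_t;   /* only the field relevant here */

/* Take a node from the free list if possible, otherwise fall back to malloc
 * (the real allocator takes memory from pool pages instead). */
static node_t *node_alloc(tree_t *t)
{
    if (t->free != NULL) {
        node_t *n = t->free;
        t->free = n->right;      /* the list is threaded through `right` */
        return n;
    }
    return malloc(sizeof(node_t));
}

/* Put a deleted node on the free list instead of releasing its memory. */
static void node_release(tree_t *t, node_t *n)
{
    assert(n != NULL);
    n->right = t->free;
    t->free = n;
}
```

Reusing the right pointer as the list link means the free list costs no extra memory per node.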

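The functions for the four radix tree operations, creating the tree, inserting a node, deleting a node, and finding a value, are declared as follows:

ngx_radix_tree_t *ngx_radix_tree_create(ngx_pool_t *pool,
    ngx_int_t preallocate);
ngx_int_t ngx_radix32tree_insert(ngx_radix_tree_t *tree,
    uint32_t key, uint32_t mask, uintptr_t value);
ngx_int_t ngx_radix32tree_delete(ngx_radix_tree_t *tree,
    uint32_t key, uint32_t mask);
uintptr_t ngx_radix32tree_find(ngx_radix_tree_t *tree, uint32_t key);

Inserting a node

A line such as "192.168.1.0/24 uk;" in the geo directive corresponds to one node in the radix tree. How does the program do this? Look first at the parameters of ngx_radix32tree_insert: key is the four bytes of the in_addr_t IP address converted to host byte order; mask is the network mask, which for /24 is the four bytes 0xFFFFFF00; value is a pointer to the ngx_http_variable_value_t that holds uk.

Where is value inserted? Starting from the highest bit of key & mask, a 0 bit leads to the left child and a 1 bit to the right child, and so on down from the root until the insertion position is reached (for the example above, the node lies at level 24). If a leaf is reached before the final position, the missing nodes between that leaf and the final position are created with value = NGX_RADIX_NO_VALUE. If the target position already holds a value, NGX_BUSY is returned; otherwise the value is set and NGX_OK is returned.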
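
The relationship between a prefix length such as /24, the mask 0xFFFFFF00, and the depth of the inserted node can be checked with a few lines of C. The helpers cidr_mask, ipv4_key, and mask_depth are hypothetical names introduced for this illustration; they are not part of Nginx.

```c
#include <assert.h>
#include <stdint.h>

/* Netmask for a prefix length 0..32, in host byte order. */
static uint32_t cidr_mask(unsigned prefix_len)
{
    assert(prefix_len <= 32);
    /* shifting a 32-bit value by 32 is undefined, so /0 is handled separately */
    return prefix_len == 0 ? 0 : (uint32_t)(UINT32_MAX << (32 - prefix_len));
}

/* Pack a dotted quad a.b.c.d into a host-order 32-bit key. */
static uint32_t ipv4_key(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | (uint32_t)d;
}

/* Level at which an entry lands = number of leading set bits in the mask. */
static unsigned mask_depth(uint32_t mask)
{
    unsigned depth = 0;
    while (mask & 0x80000000u) {
        depth++;
        mask <<= 1;
    }
    return depth;
}
```

For 192.168.1.0/24 this gives the mask 0xFFFFFF00 and a depth of 24, matching the description above.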

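Creation

Creation allocates the radix tree structure and its root node, then inserts a number of nodes according to the value of preallocate. When preallocate equals -1, it is reset to a suitable value, so a different number of nodes is inserted on different platforms.

Concretely, preallocate means that all nodes from level 1 through level preallocate are inserted, so that together with the root the tree contains 2^(preallocate+1)-1 nodes after creation. What value should be chosen per platform when preallocate = -1? That is decided by num = ngx_pagesize / sizeof(ngx_radix_node_t). When num = 128, preallocate = 6: the preallocated nodes form a complete binary tree, and with level 6 fully populated the tree has 127 nodes, which just fits within one page of memory. Preallocating further levels would cost more than it gains. I cannot explain this very clearly myself, so here is the comment from the source:

/*
 * Preallocation of first nodes : 0, 1, 00, 01, 10, 11, 000, 001, etc.
 * increases TLB hits even if for first lookup iterations.
 * On 32-bit platforms the 7 preallocated bits takes continuous 4K,
 * 8 - 8K, 9 - 16K, etc.  On 64-bit platforms the 6 preallocated bits
 * takes continuous 4K, 7 - 8K, 8 - 16K, etc.  There is no sense to
 * to preallocate more than one page, because further preallocation
 * distributes the only bit per page.  Instead, a random insertion
 * may distribute several bits per page.
 *
 * Thus, by default we preallocate maximum
 *     6 bits on amd64 (64-bit platform and 4K pages)
 *     7 bits on i386 (32-bit platform and 4K pages)
 *     7 bits on sparc64 in 64-bit mode (8K pages)
 *     8 bits on sparc64 in 32-bit mode (8K pages)
 */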
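
The numbers in that comment follow from the node-count formula 2^(preallocate+1)-1. The sketch below derives them from a page size and a node size; it is a derivation for illustration, not the way Nginx itself selects the value, and full_tree_nodes and prealloc_levels are hypothetical names. It assumes a node is four pointer-sized fields, that is 32 bytes on 64-bit and 16 bytes on 32-bit platforms.

```c
#include <assert.h>
#include <stddef.h>

/* Number of nodes in a complete binary tree holding levels 0..levels. */
static size_t full_tree_nodes(unsigned levels)
{
    assert(levels < 8 * sizeof(size_t) - 1);
    return ((size_t)1 << (levels + 1)) - 1;
}

/* Largest number of preallocated levels whose nodes still fit in one page;
 * -1 means not even the root fits. */
static int prealloc_levels(size_t page_size, size_t node_size)
{
    size_t per_page = page_size / node_size;
    int levels = -1;

    while (full_tree_nodes((unsigned)(levels + 1)) <= per_page)
        levels++;
    return levels;
}
```

For a 4K page and 32-byte nodes, 128 nodes fit in a page and the largest complete tree below that has 127 nodes, giving 6 levels, the amd64 case in the comment.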

Lookup

Given an IP address, how is the matching variable value found in the radix tree? First convert the IP address to four bytes in host byte order, then call ngx_radix32tree_find. The function starts from the highest bit of the 32-bit key; a 0 bit leads to the left child and a 1 bit to the right child. It walks down from the root this way until it runs out of nodes, and the value of the last node on the path whose value is not NGX_RADIX_NO_VALUE is the value returned. The code:

uintptr_t
ngx_radix32tree_find(ngx_radix_tree_t *tree, uint32_t key)
{
    uint32_t           bit;
    uintptr_t          value;
    ngx_radix_node_t  *node;

    bit = 0x80000000;
    value = NGX_RADIX_NO_VALUE;
    node = tree->root;

    while (node) {
        if (node->value != NGX_RADIX_NO_VALUE) {
            value = node->value;
        }

        if (key & bit) {
            node = node->right;

        } else {
            node = node->left;
        }

        bit >>= 1;
    }

    return value;
}
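
To see the insert and find rules working together, here is a small standalone binary prefix tree that mimics them. It is a sketch, not the Nginx implementation: pnode, prefix_insert, prefix_find, and NO_VALUE are hypothetical names, nodes come from malloc rather than a pool, and values are plain integers instead of pointers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NO_VALUE ((uintptr_t)-1)   /* plays the role of NGX_RADIX_NO_VALUE */

typedef struct pnode { struct pnode *left, *right; uintptr_t value; } pnode;

static pnode *pnode_new(void)
{
    pnode *n = malloc(sizeof *n);
    assert(n != NULL);
    n->left = n->right = NULL;
    n->value = NO_VALUE;
    return n;
}

/* Store `value` at the node reached by the masked bits of `key`.
 * Returns 0 on success, -1 if that prefix already has a value. */
static int prefix_insert(pnode *root, uint32_t key, uint32_t mask, uintptr_t value)
{
    pnode *node = root;

    for (uint32_t bit = 0x80000000u; bit & mask; bit >>= 1) {
        pnode **next = (key & bit) ? &node->right : &node->left;
        if (*next == NULL)
            *next = pnode_new();      /* fill the gap with a valueless node */
        node = *next;
    }
    if (node->value != NO_VALUE)
        return -1;
    node->value = value;
    return 0;
}

/* Longest-prefix match: remember the last valued node on the path. */
static uintptr_t prefix_find(const pnode *root, uint32_t key)
{
    uintptr_t value = NO_VALUE;
    uint32_t bit = 0x80000000u;

    for (const pnode *node = root; node != NULL; bit >>= 1) {
        if (node->value != NO_VALUE)
            value = node->value;
        node = (key & bit) ? node->right : node->left;
    }
    return value;
}
```

Inserting the geo example with a mask of 0 for default, then looking up 192.168.1.23 (0xC0A80117), returns the value stored for 192.168.1.0/24, while an address that matches no specific range falls back to the default stored at the root.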

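Deleting a node

Deletion first locates the node to delete, in the same way as insertion. If the node cannot be found, NGX_ERROR is returned. Otherwise there are two cases:

If the node is a leaf, it is removed and added to the list linked from free through the right pointers, to be reused later. If its parent thereby becomes a leaf whose value is NGX_RADIX_NO_VALUE, the parent is removed in the same way, and so on up toward the root.

If the node has at least one child and its value is not NGX_RADIX_NO_VALUE, its value is simply set to NGX_RADIX_NO_VALUE. Handling it this way keeps deletion simple; the node is only really removed from the tree later, when the first case applies to it.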

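hash_map vs radix tree

August 30th, 2011, 绚丽也尘埃

While reading code recently I came across a use of a radix tree. When the engine builds indexes over data, it needs a mapping table from field name to field number, and this table is used very heavily. Suppose there are 600 million documents with 100 fields each, and many fields are indexed into the index, profile, and detail indexes at once, which means three table lookups for such a field. That is at least 60 billion lookups. Making these lookups faster improves the efficiency of the whole program.

I wrote a small program to compare hash_map (the implementation on Linux) with a radix tree. In theory the radix tree should be considerably faster. Looking up a string in it takes O(n). hash_map has to compute the hash of the string, which means traversing the string at least once, plus some extra arithmetic. Another factor is that the C code is compact and easy to inline when declared inline, whereas hash_map must use a hash function object and its own code is more complex and harder to inline. The main drawback of the radix tree is that if each node's pointer array is allocated dynamically, the code becomes more cumbersome to write. Destroying a radix tree is also more cumbersome. The Linux kernel uses radix trees too; I have not read that code, but it is presumably very polished.

The example program was run on an RHEL4 server with 8 cores and 8 GB of memory. hash_map and radix_tree each insert the same 16 nodes and then perform 100 million lookups. With -O2, hash_map takes about 25 s while the radix tree takes only about 2 s, a very clear improvement. I also compared map, which has the worst lookup performance at 34 s. For 60 billion lookups, hash_map would need about 250 minutes (600 × 25 s). Spread over 20 machines with 6 threads each, every thread would spend roughly 2 minutes. One could use oprofile to measure how much time an index build currently spends in radix tree lookups.

[Example program listing not preserved]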

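Radix Tree Algorithm

Posted by chenyajun in Nginx, Data Structures and Algorithms

Nginx has a module, geo, that defines different variable values for different IP addresses. It uses both a radix tree and a red-black tree.

Radix Tree

In essence it is a variant of the trie. The difference is that an edge does not hold a single character as in a trie, but can hold several characters. This makes path compression possible and effectively reduces the depth of the tree. Radix trees are used in BSD route lookup and in the Linux kernel.

Algorithm implementation

The Wikipedia article describes clearly what a radix tree is in broad terms.

Complexity

The Linux radix tree is a mechanism that associates a pointer with a long integer key. It is storage-efficient and supports fast lookup, and it is used for mapping integer values to pointers (for example, the IDR mechanism), for memory management, and so on.

The IDR (ID Radix) mechanism builds an association table between an object's integer ID and a pointer to the object, and converts between IDs and pointers. IDR uses a radix tree as a sparse array indexed by id to retrieve the pointer, and uses a bitmap to allocate new IDs quickly. It avoids storing the pointers in a fixed-size array. The IDR API functions are implemented in lib/idr.c and are not analyzed here.

The most widespread use of the Linux radix tree is memory management. The address_space structure uses a radix tree to track the in-core pages bound to an address mapping, and this radix tree lets the memory management code quickly find pages marked dirty or writeback. The Linux radix tree API functions are implemented in lib/radix-tree.c.

Radix tree overview

The radix tree is a general-purpose dictionary data structure, also known as a PAT bit tree (Patricia trie or crit-bit tree). The Linux kernel uses a version with fixed-length input of type unsigned long. Each level represents a fixed number of bits of the input space.

A radix tree is a multiway search tree whose leaf nodes are the actual data items. Each node has a fixed number, 2^n, of pointers to child nodes (each pointer is called a slot), plus a pointer to its parent node.

The Linux kernel uses a radix tree to locate a file's cached pages quickly by offset within the file. Figure 4 is a sample radix tree with a fanout of 4 (2^2) and a height of 4. Each leaf of the tree locates an 8-bit file offset, so 4x4x4x4 = 256 pages can be located. For example, the paths to the two leaf nodes marked with dashed lines in the figure spell the binary values 00000010 and 11111010, which point to the cached pages at the corresponding file offsets.

Figure 4: A four-way radix tree

Each node of the Linux radix tree has 64 slots, the same as the number of bits in the long data type. Figure 1 shows a radix tree with 3 levels of nodes. Each data item can be indexed by three 6-bit keys, which from left to right give the node position at levels 1 to 3. Nodes without children do not appear in the figure. A radix tree therefore stores sparse trees efficiently and provides fast key-to-pointer lookup in place of a fixed-size array.

Figure 1: A radix tree with 3 levels of nodes and its key representation

Radix tree slot count

Depending on user configuration, the Linux kernel sets the number of index bits per node to 4 or 6, so that each node has 16 or 64 slots. As Figure 2 shows, with a tree height of 1 the 64 slots correspond to 64 pages, and with a tree height of 2 they correspond to 64*64 pages.

Figure 2: Radix trees of height 1 and 2 with 64 slots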
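
How a key is split into one 6-bit slot index per level can be sketched as follows. The macro and function names here (MAP_SHIFT, MAP_SIZE, MAP_MASK, slot_index, max_index) are hypothetical and chosen for this illustration; the sketch only reproduces the arithmetic described above, not the kernel's code.

```c
#include <assert.h>

#define MAP_SHIFT 6                      /* index bits consumed per level */
#define MAP_SIZE  (1UL << MAP_SHIFT)     /* 64 slots per node */
#define MAP_MASK  (MAP_SIZE - 1)

/* Slot index used at `level` (0 = root) of a tree that is `height` levels tall. */
static unsigned slot_index(unsigned long key, unsigned height, unsigned level)
{
    unsigned shift;

    assert(height > 0 && level < height);
    shift = (height - 1 - level) * MAP_SHIFT;
    return (unsigned)((key >> shift) & MAP_MASK);
}

/* Largest key a tree of the given height can index: 64^height - 1. */
static unsigned long max_index(unsigned height)
{
    assert(height * MAP_SHIFT < 8 * sizeof(unsigned long));
    return (1UL << (height * MAP_SHIFT)) - 1;
}
```

A tree of height 1 covers indexes 0 to 63 and a tree of height 2 covers 0 to 4095, which matches the 64 and 64*64 pages mentioned for Figure 2.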
