SICP in Python: Huffman Encoding

2016-07-29 21:40

A Huffman tree is built from a recursive definition and then used to encode and decode messages.

# A leaf is ('leaf', symbol, weight).
def make_leaf(symbol, weight):
    return ('leaf', symbol, weight)
def is_leaf(obj):
    return obj[0] == 'leaf'
def weight_ofleaf(x):
    return x[2]
def symbol_ofleaf(x):
    return x[1]

# An internal node is ('node', left, right, total weight, symbols below it).
def make_tree(left, right):
    return ('node', left, right, weight(left) + weight(right), symbols(left) + symbols(right))
def lefttree(t):
    return t[1]
def righttree(t):
    return t[2]
def symbols(t):
    # A list, so multi-character symbols are not fused by string concatenation.
    return [symbol_ofleaf(t)] if is_leaf(t) else t[4]
def weight(t):
    return weight_ofleaf(t) if is_leaf(t) else t[3]
def dec(bits, tree):
    # Decode a string of '0'/'1' characters into a list of symbols.
    def decode_one():
        nonlocal i
        node = tree
        while not is_leaf(node):
            if i >= n:
                raise ValueError('bit string ends in the middle of a code')
            if bits[i] == '0':
                node = lefttree(node)
            elif bits[i] == '1':
                node = righttree(node)
            else:
                raise ValueError('invalid bit: %r' % bits[i])
            i += 1
        return symbol_ofleaf(node)
    i, n = 0, len(bits)
    res = []
    while i < n:
        res.append(decode_one())
    return res
def create_set(l):
    # Leaves in increasing order of weight (a list; a set would discard the ordering).
    u = sorted(l, key=lambda x: x[1])
    return [make_leaf(e[0], e[1]) for e in u]

sample_tree = make_tree(make_leaf('A', 4),
                        make_tree(make_leaf('B', 2),
                                  make_tree(make_leaf('D', 1), make_leaf('C', 1))))
# print(dec('0110010101110', sample_tree))  # ['A', 'D', 'A', 'B', 'B', 'C', 'A']
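def encode(message, tree):
    def encode_symbol(symbol, tree):
        if is_leaf(tree):
            # Ending on a leaf that does not match means the symbol is not in the tree.
            if symbol_ofleaf(tree) != symbol:
                raise ValueError('symbol not in tree: %r' % (symbol,))
            return ''
        if symbol in symbols(lefttree(tree)):
            return '0' + encode_symbol(symbol, lefttree(tree))
        return '1' + encode_symbol(symbol, righttree(tree))
    # Join the per-symbol codes instead of recursing once per message element.
    return ''.join(encode_symbol(s, tree) for s in message)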
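
The listing builds its sample tree by hand. As a rough sketch of how a tree could be generated from (symbol, weight) pairs instead, the block below repeatedly merges the two lightest subtrees. The name generate_huffman_tree and the use of heapq with a counter as tie-breaker are assumptions made for this illustration, not part of the original post; the constructors are restated (with symbols kept in lists) so the block runs on its own.

```python
import heapq
from itertools import count

# Constructors and selectors restated so this block is self-contained.
def make_leaf(symbol, weight):
    return ('leaf', symbol, weight)

def is_leaf(obj):
    return obj[0] == 'leaf'

def weight(t):
    return t[2] if is_leaf(t) else t[3]

def symbols(t):
    return [t[1]] if is_leaf(t) else t[4]

def make_tree(left, right):
    return ('node', left, right,
            weight(left) + weight(right),
            symbols(left) + symbols(right))

def generate_huffman_tree(pairs):
    """Hypothetical helper: build a Huffman tree from (symbol, weight) pairs."""
    if not pairs:
        raise ValueError('need at least one (symbol, weight) pair')
    tiebreak = count()  # unique counter so the heap never compares two trees
    heap = [(w, next(tiebreak), make_leaf(s, w)) for s, w in pairs]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees and put the result back.
        _, _, a = heapq.heappop(heap)
        _, _, b = heapq.heappop(heap)
        merged = make_tree(a, b)
        heapq.heappush(heap, (weight(merged), next(tiebreak), merged))
    return heap[0][2]

tree = generate_huffman_tree([('A', 4), ('B', 2), ('C', 1), ('D', 1)])
print(weight(tree), sorted(symbols(tree)))  # 8 ['A', 'B', 'C', 'D']
```

With these weights the result has the same shape as sample_tree: 'A' hangs directly off the root, while 'C' and 'D' sit at the deepest level. Which of two equal-weight leaves ends up on the left depends on the tie-breaking order.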


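Exercise 2.71: With weights 1, 2, 4, ..., 2^(n-1), it is easy to check that after every merge the combined subtree is still the lightest item remaining (1 + 2 + ... + 2^(k-1) = 2^k - 1 < 2^k), so the original order never changes and the tree degenerates into a chain. The most frequent symbol therefore needs 1 bit and the least frequent one needs n-1 bits.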
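
The claim about code lengths can be checked numerically. The following is a small sketch, not the post's own code: code_lengths is a hypothetical helper that tracks only the depth of each symbol while merging, and the symbol names s0..s4 are made up for the example.

```python
import heapq
from itertools import count

def code_lengths(weights):
    """Return {symbol: code length} for a Huffman tree over the given weights."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    # Each heap entry: (total weight, tiebreak, {symbol: depth so far}).
    heap = [(w, next(tiebreak), {s: 0}) for s, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging pushes every symbol in both subtrees one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]

n = 5
weights = {f's{k}': 2 ** k for k in range(n)}  # weights 1, 2, 4, 8, 16
lengths = code_lengths(weights)
print(lengths['s4'], lengths['s0'])  # 1 4
```

For n = 5 the heaviest symbol gets a 1-bit code and the lightest gets n - 1 = 4 bits, matching the chain-shaped tree described above.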

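Exercise 2.72: For the tree from Exercise 2.71, the answer depends on how the symbol set at each node is searched:

(a) If each node's symbol set is a hash table, every membership test is O(1), so encoding the most frequent symbol is always O(1) and encoding the least frequent one is O(n), one test per level.

(b) If the symbol list is searched linearly, encoding the most frequent symbol is O(n) in general. If the tree is built so that the heavier subtree always sits on the same side (left or right) and that side is the one tested, it drops to O(1). If the two subtrees are placed at random, it is O(n).

For the least frequent symbol, the search is repeated at every level on a list one element shorter, so T(n) = O(n) + O(n-1) + ... + O(1) = O(n^2).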
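
To make the linear-search case concrete, here is a rough sketch that counts symbol comparisons in the chain-shaped tree from Exercise 2.71. It does not use the tree code from the listing: the function names, the ranking of symbols as 0..n-1 (0 most frequent), and the worst-case ordering of each symbol list are all assumptions made for this illustration.

```python
def linear_search(item, items):
    """Scan a list left to right; return (found, number of comparisons made)."""
    for steps, candidate in enumerate(items, start=1):
        if candidate == item:
            return True, steps
    return False, len(items)

def comparisons_to_encode(target, n, test_leaf_side):
    """Count symbol comparisons needed to encode `target` in the chain-shaped tree.

    Assumption: symbols are ranks 0..n-1, where 0 is the most frequent and
    n-1 the least frequent, and every node has one leaf child (the most
    frequent remaining symbol) and one subtree child (everything else).
    """
    total = 0
    remaining = list(range(n))
    while len(remaining) > 1:
        leaf, rest = remaining[0], remaining[1:]
        if test_leaf_side:
            # Test the one-element symbol list of the leaf child.
            found, steps = linear_search(target, [leaf])
            at_leaf = found
        else:
            # Test the symbol list of the larger subtree child.
            found, steps = linear_search(target, rest)
            at_leaf = not found
        total += steps
        if at_leaf:
            break
        remaining = rest
    return total

n = 8
for test_leaf_side in (True, False):
    most = comparisons_to_encode(0, n, test_leaf_side)
    least = comparisons_to_encode(n - 1, n, test_leaf_side)
    print(test_leaf_side, most, least)
# True 1 7
# False 7 28
```

For n = 8, testing the leaf side costs 1 comparison for the most frequent symbol and n - 1 = 7 for the least frequent one. Testing the subtree side costs 7 and 7 + 6 + ... + 1 = 28 respectively, which is the quadratic sum. With hash-based symbol sets each test would cost O(1), leaving only the depth of the symbol.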
