05-Tree 9 Huffman Codes
2016-04-21 16:02
05-Tree 9 Huffman Codes (30 points)
In 1953, David A. Huffman published his paper "A Method for the Construction of Minimum-Redundancy Codes", and hence printed his name in the history of computer science. As a professor who gives the final exam problem on Huffman codes, I am encountering a big problem: the Huffman codes are NOT unique. For example, given a string "aaaxuaxz", we can observe that the frequencies of the characters 'a', 'x', 'u' and 'z' are 4, 2, 1 and 1, respectively. We may either encode the symbols as {'a'=0, 'x'=10, 'u'=110, 'z'=111}, or in another way as {'a'=1, 'x'=01, 'u'=001, 'z'=000}, both compress the string into 14 bits. Another set of code can be given as {'a'=0, 'x'=11, 'u'=100, 'z'=101}, but {'a'=0, 'x'=01, 'u'=011, 'z'=001} is NOT correct since "aaaxuaxz" and "aazuaxax" can both be decoded from the code 00001011001001. The students are submitting all kinds of codes, and I need a computer program to help me determine which ones are correct and which ones are not.
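The 14-bit figure in the example can be checked by multiplying each frequency by the length of its code word. The sketch below does exactly that; `encoded_bits` is a hypothetical helper name, not something from the problem or the solution further down.

```c
#include <string.h>

/* Total number of bits needed to encode a text, given each symbol's
 * frequency and its code word.
 * Hypothetical helper for illustration only. */
int encoded_bits(const int freq[], const char *codes[], int n)
{
    int bits = 0;
    for (int i = 0; i < n; i++) {
        bits += freq[i] * (int)strlen(codes[i]);
    }
    return bits;
}
```

With frequencies {4, 2, 1, 1}, the code {0, 10, 110, 111} costs 4*1 + 2*2 + 1*3 + 1*3 = 14 bits. The incorrect code {0, 01, 011, 001} also costs 14 bits, so matching the optimal length alone is not enough; the code must also be a prefix code.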
Input Specification:
Each input file contains one test case. For each case, the first line gives an integer N (2 ≤ N ≤ 63), then followed by a line that contains all the N distinct characters and their frequencies in the following format:
c[1] f[1] c[2] f[2] ... c[N] f[N]
where c[i] is a character chosen from {'0' - '9', 'a' - 'z', 'A' - 'Z', '_'}, and f[i] is the frequency of c[i] and is an integer no more than 1000. The next line gives a positive integer M (≤ 1000), then followed by M student submissions. Each student submission consists of N lines, each in the format:
c[i] code[i]
where c[i] is the i-th character and code[i] is a non-empty string of no more than 63 '0's and '1's.
Output Specification:
For each test case, print in each line either "Yes" if the student's submission is correct, or "No" if not.
Note: The optimal solution is not necessarily generated by Huffman algorithm. Any prefix code with code length being optimal is considered correct.
Sample Input:
7
A 1 B 1 C 1 D 3 E 3 F 6 G 6
4
A 00000
B 00001
C 0001
D 001
E 01
F 10
G 11
A 01010
B 01011
C 0100
D 011
E 10
F 11
G 00
A 000
B 001
C 010
D 011
E 100
F 101
G 110
A 00000
B 00001
C 0001
D 001
E 00
F 10
G 11
Sample Output:
Yes
Yes
No
No
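Because any prefix code of optimal length is accepted, only the optimal cost itself is needed, not the tree. One way to get it, sketched here under the assumption that a simple O(N^2) scan is acceptable for N ≤ 63, is to merge the two smallest weights repeatedly and add up the merged sums. `optimal_cost` is a hypothetical name and this is not the heap-based approach used in the solution.

```c
/* Minimum total encoded length for the given frequencies (n >= 1).
 * Each merge of the two smallest weights adds one bit to every symbol
 * beneath it, so the merged sums add up to the optimal cost.
 * Assumes n <= 64 (the problem guarantees n <= 63). */
int optimal_cost(const int freq[], int n)
{
    int w[64];
    int cost = 0;
    for (int i = 0; i < n; i++) {
        w[i] = freq[i];
    }
    while (n > 1) {
        /* a = index of the smallest weight, b = index of the second smallest */
        int a = 0, b = 1;
        if (w[b] < w[a]) { a = 1; b = 0; }
        for (int i = 2; i < n; i++) {
            if (w[i] < w[a]) { b = a; a = i; }
            else if (w[i] < w[b]) { b = i; }
        }
        int merged = w[a] + w[b];
        cost += merged;
        w[a] = merged;      /* the merged node takes slot a */
        w[b] = w[n - 1];    /* the last entry fills slot b */
        n--;
    }
    return cost;
}
```

For the sample frequencies 1 1 1 3 3 6 6 this gives 53, which matches the first submission (5 + 5 + 4 + 9 + 6 + 12 + 12 = 53). The third submission uses 3 bits for every symbol, 63 bits in total, so it is rejected.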
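Approach:
This is the Huffman tree that Professor He covered in the lectures. The key part is checking whether a submission is a prefix code, and for that I followed inaho's approach.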
#include <stdio.h>
#include <stdlib.h>

#define MinData 0   // sentinel weight, no larger than any real frequency

typedef struct TreeNode *HuffmanTree;
struct TreeNode {
    int weight;
    HuffmanTree left;
    HuffmanTree right;
};  // Huffman tree node

typedef struct HeapStruct *MinHeap;
struct HeapStruct {
    HuffmanTree elements;   // node array; elements[0] is the sentinel
    int size;
    int capacity;
};  // min-heap

MinHeap BuildMinHeap(int Maxsize, int weight[]);
MinHeap CreateMinHeap(int Maxsize);
void Insert(MinHeap H, HuffmanTree HT);
HuffmanTree DeleteMin(MinHeap H);
HuffmanTree Huffman(MinHeap H);
void GetWpl(HuffmanTree HT, int layer, int *wpl);
int code_length(char *a);
int compare(char *c1, char *c2);

int main(int argc, char const *argv[])
{
    // freopen("test.txt", "r", stdin);
    int N;          // number of distinct characters
    scanf("%d", &N);
    char c[64];     // the characters
    int f[64];      // the frequency of each character
    for (int i = 0; i < N; i++) {
        // the leading space skips any whitespace before the character
        scanf(" %c %d", &c[i], &f[i]);
    }

    // build the Huffman tree and compute the optimal weighted path length
    MinHeap MH = BuildMinHeap(N, f);
    HuffmanTree HT = Huffman(MH);
    int minwpl = 0;
    GetWpl(HT, 0, &minwpl);

    int M;          // number of student submissions
    scanf("%d", &M);
    char ch[64], code[64][64];
    for (int j = 0; j < M; j++) {       // M submissions
        for (int i = 0; i < N; i++) {   // N codes per submission
            scanf(" %c %63s", &ch[i], code[i]);
        }

        // prefix-code check: flag stays 1 only if no code is a prefix of another
        int flag = 1;
        for (int i = 0; i < N && flag; i++) {
            for (int k = i + 1; k < N; k++) {
                if (compare(code[i], code[k])) {
                    flag = 0;
                    break;
                }
            }
        }

        // a prefix code is correct only if its cost equals the optimal WPL
        if (flag) {
            int stu_wpl = 0;
            for (int i = 0; i < N; i++)
                stu_wpl += f[i] * code_length(code[i]);
            if (stu_wpl != minwpl)
                flag = 0;
        }
        printf(flag ? "Yes\n" : "No\n");
    }
    return 0;
}

// length of a code string
int code_length(char *a)
{
    int len = 0;
    char *p = a;
    while (*p != '\0') {
        p++;
        len++;
    }
    return len;
}

// returns 1 if one string is a prefix of the other (or they are equal), else 0
int compare(char *c1, char *c2)
{
    char *a = c1, *b = c2;
    while (*a != '\0' && *b != '\0') {
        if (*a != *b)
            return 0;
        a++;
        b++;
    }
    return 1;
}

MinHeap BuildMinHeap(int Maxsize, int weight[])
{
    MinHeap H = CreateMinHeap(Maxsize);
    struct TreeNode temp;   // Insert copies the node, so one scratch node is enough
    for (int i = 0; i < Maxsize; i++) {
        temp.weight = weight[i];
        temp.left = temp.right = NULL;
        Insert(H, &temp);
    }
    return H;
}

MinHeap CreateMinHeap(int Maxsize)
{
    MinHeap H = (MinHeap)malloc(sizeof(struct HeapStruct));
    // elements[0] is the sentinel and data starts at [1], so allocate Maxsize+1
    H->elements = (HuffmanTree)malloc(sizeof(struct TreeNode) * (Maxsize + 1));
    H->size = 0;
    H->capacity = Maxsize;
    H->elements[0].weight = MinData;    // sentinel
    return H;
}

void Insert(MinHeap H, HuffmanTree HT)
{
    int i;
    if (H->size == H->capacity) {
        printf("Min-heap is full!");
        return;
    }
    i = ++(H->size);    // i is the new last slot, which is empty for now
    // the heap starts at index 1; while the parent is larger, move it down
    for ( ; H->elements[i / 2].weight > HT->weight; i /= 2) {
        // copy the whole node, not just the weight, so children are kept
        H->elements[i] = H->elements[i / 2];
    }
    H->elements[i] = *HT;
}

// removes the minimum element and returns a heap-allocated copy of it
HuffmanTree DeleteMin(MinHeap H)
{
    int parent, child;
    HuffmanTree MinItem;
    struct TreeNode temp;
    if (H->size == 0) {
        printf("Min-heap is empty");
        return NULL;
    }
    MinItem = (HuffmanTree)malloc(sizeof(struct TreeNode));
    *MinItem = H->elements[1];          // save the minimum
    temp = H->elements[H->size--];      // sift the last element down from the root
    for (parent = 1; parent * 2 <= H->size; parent = child) {   // while a left child exists
        child = parent * 2;
        // pick the smaller of the two children
        if ((child != H->size) &&
            (H->elements[child].weight > H->elements[child + 1].weight)) {
            child++;
        }
        if (temp.weight <= H->elements[child].weight)
            break;
        H->elements[parent] = H->elements[child];
    }
    H->elements[parent] = temp;
    return MinItem;
}

HuffmanTree Huffman(MinHeap H)
{
    int i;
    HuffmanTree HT;
    int times = H->size;    // H->size changes during the loop, so save it first
    for (i = 1; i < times; i++) {
        HT = (HuffmanTree)malloc(sizeof(struct TreeNode));
        HT->left = DeleteMin(H);
        HT->right = DeleteMin(H);
        HT->weight = HT->left->weight + HT->right->weight;
        Insert(H, HT);
        free(HT);           // Insert stored a copy in the heap
    }
    HT = DeleteMin(H);
    return HT;
}

// accumulates the weighted path length: sum of depth * weight over all leaves
void GetWpl(HuffmanTree HT, int layer, int *wpl)
{
    if (HT->left == NULL && HT->right == NULL) {
        (*wpl) = (*wpl) + layer * HT->weight;
    } else {    // not a leaf
        GetWpl(HT->left, layer + 1, wpl);
        GetWpl(HT->right, layer + 1, wpl);
    }
}
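The prefix-code check is the part that is easiest to get wrong, so here it is in isolation. This is a minimal sketch that uses the standard `strncmp` in place of the hand-written `compare`; `is_prefix_free` is a hypothetical name, and the logic is meant to match the pairwise test in the solution rather than replace it.

```c
#include <string.h>

/* Returns 1 if no code word is a prefix of another, 0 otherwise.
 * Two identical code words also count as a clash. */
int is_prefix_free(const char *codes[], int n)
{
    for (int i = 0; i < n; i++) {
        for (int k = i + 1; k < n; k++) {
            size_t li = strlen(codes[i]);
            size_t lk = strlen(codes[k]);
            size_t shorter = li < lk ? li : lk;
            /* if the first `shorter` characters match, the shorter word
             * is a prefix of the longer one */
            if (strncmp(codes[i], codes[k], shorter) == 0) {
                return 0;
            }
        }
    }
    return 1;
}
```

On the last sample submission this returns 0, because E's code 00 is a prefix of D's code 001, so that submission fails the prefix test even before its cost is compared. With N ≤ 63 and code words of at most 63 characters, the pairwise scan is cheap enough that a trie is not needed.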