POJ 1200 -- Hashing, Karp-Rabin, Discretization (Time to Review)
2014-08-05 16:51
Crazy Search
Time Limit: 1000MS | Memory Limit: 65536K
Total Submissions: 22715 | Accepted: 6379
Description
Many people like to solve hard puzzles some of which may lead them to madness. One such puzzle could be finding a hidden prime number in a given text. Such number could be the number of different substrings of a given size that exist in the text. As you soon will discover, you really need the help of a computer and a good algorithm to solve such a puzzle.
Your task is to write a program that given the size, N, of the substring, the number of different characters that may occur in the text, NC, and the text itself, determines the number of different substrings of size N that appear in the text.
As an example, consider N=3, NC=4 and the text "daababac". The different substrings of size 3 that can be found in this text are: "daa"; "aab"; "aba"; "bab"; "bac". Therefore, the answer should be 5.
Input
The first line of input consists of two numbers, N and NC, separated by exactly one space. This is followed by the text where the search takes place. You may assume that the maximum number of substrings formed by the possible set of characters does not exceed 16 Millions.
Output
The program should output just an integer corresponding to the number of different substrings of size N found in the given text.
Sample Input
3 4 daababac
Sample Output
5
Hint
Huge input, scanf is recommended.
Source
Southwestern Europe 2002
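Problem summary:
Given a string, count how many distinct substrings of a fixed length it contains. Two integers n and nc are also given,
where n is the required substring length and nc is the number of distinct characters that appear in the string.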
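Approach:
(1) Discretization: the number of distinct letters in the text is fixed (nc) and may be smaller than 26,
so we first map each letter to a dense id in 0..nc-1, which reduces both space and time.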
    scanf("%s",a);                   // read the text
    int sz=strlen(a),t=0;
    for(int i=0;i<sz;i+=1){          // discretization
        if(name[a[i]-'a']==-1){      // number each letter on first appearance; a[i]-'a' serves directly as the index
            name[a[i]-'a']=t++;
        }
    }
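(2) Hashing:
We use the Karp-Rabin hash function here; below is my own understanding of the method.
The general idea is as shown in the figure above, although that layout is not the only option: the digits can be taken in forward or reverse order.
The principle is:
Read in base x, the value Hash(i,L) in the figure is always an L-digit number.
In other words, every substring of length L corresponds to an L-digit number in base x.
Here L is the substring length, and x is determined by (in fact equal to) the number of characters that can appear at each position.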
Here is the accepted (AC) code:
#include <iostream>
#include <string>
#include <stdio.h>
#include <string.h>
#define maxn 16000005
using namespace std;
char a[maxn];            // the input text
signed char name[30];    // name[c-'a'] = id of letter c, or -1 if not seen yet (lowercase letters only)
bool hashv[maxn];        // hashv[v] is true once a substring with value v has been seen
int main()
{
    int n,nc;
    while(scanf("%d %d",&n,&nc)!=EOF){
        memset(name,-1,sizeof(name));
        memset(hashv,false,sizeof(hashv));
        getchar();
        scanf("%s",a);                   // read the text
        int sz=strlen(a),t=0;
        for(int i=0;i<sz;i+=1){          // discretization
            if(name[a[i]-'a']==-1){      // number each letter on first appearance; a[i]-'a' serves directly as the index
                name[a[i]-'a']=t++;
            }
        }
        int tmp=0;
        t=nc;
        for(int i=0;i<n-1;i+=1){         // first accumulate the value of the first n-1 characters
            tmp=tmp*nc+name[a[i]-'a'];
            t*=nc;
        }                                // after the loop, t equals nc^n
        int countt=0;
        for(int i=n-1;i<sz;i+=1){
            tmp=(tmp*nc+name[a[i]-'a'])%t;
            // for i>=n this is equivalent to tmp=tmp*nc-name[a[i-n]-'a']*t+name[a[i]-'a'];
            if(!hashv[tmp]){
                hashv[tmp]=true;
                countt+=1;
                /*string str(a,i-n+1,n);
                cout<<str<<'\n';*/
            }
        }
        printf("%d\n",countt);
    }
    return 0;
}