
POJ 1200 Crazy Search (hashing) 【template】

2017-08-17 20:01, 405 views
Many people like to solve hard puzzles some of which may lead them to madness. One such puzzle could be finding a hidden prime number in a given text. Such number could be the number of different substrings of a given size that exist in the text. As you soon will discover, you really need the help of a computer and a good algorithm to solve such a puzzle.

Your task is to write a program that given the size, N, of the substring, the number of different characters that may occur in the text, NC, and the text itself, determines the number of different substrings of size N that appear in the text. 

As an example, consider N=3, NC=4 and the text "daababac". The different substrings of size 3 that can be found in this text are: "daa"; "aab"; "aba"; "bab"; "bac". Therefore, the answer should be 5. 

Input
The first line of input consists of two numbers, N and NC, separated by exactly one space. This is followed by the text where the search takes place. You may assume that the maximum number of substrings formed by the possible set of characters does not exceed 16 million.

Output
The program should output just an integer corresponding to the number of different substrings of size N found in the given text.

Sample Input
3 4
daababac


Sample Output
5


Hint
Huge input, scanf is recommended.

【Solution】

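The task is simple: given a text built from NC different characters, count how many distinct substrings of length N it contains. The statement guarantees that the number of possible substrings (NC^N) does not exceed 16,000,000.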

Analysis:

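First, note that the input is large. People online report that a 12,000,000-element array is enough to pass, but that is still on the order of 10^7 entries, so a naive approach will not pass and a better algorithm is needed. I first tried a map of key-value pairs, but unfortunately it timed out (the code is included further down, since it is still a valid approach). After thinking it over for a while I tried hashing, and it passed, and quickly too: only 63ms. Hashing really is powerful. The idea is to assign a number to every character of the text, so that digits stand in for letters: for example, a can be 0, b can be 1, and so on.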
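The character-numbering step can be sketched as follows. This is only an illustration: the helper name assignDigits, its parameters, and the use of -1 as the "not numbered yet" marker are my own assumptions, not part of the accepted code.

```cpp
#include <array>
#include <cassert>
#include <string>

// Hypothetical helper (not from the original post): number the distinct
// characters of `text` in order of first appearance.
// ids[c] becomes the digit for character c, or -1 if c never occurs.
// `nc` is the NC promised by the statement; returns the count actually found.
int assignDigits(const std::string& text, int nc, std::array<int, 256>& ids) {
    ids.fill(-1);                          // -1 marks "not numbered yet"
    int count = 0;
    for (unsigned char c : text) {
        if (ids[c] == -1) ids[c] = count++;
    }
    assert(count <= nc);                   // the input promises at most NC distinct characters
    return count;
}
```

For the sample text "daababac" this numbers d, a, b, c as 0, 1, 2, 3. Using -1 rather than 0 for "unseen" keeps the character that received digit 0 from being numbered a second time.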

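Next, go through every substring of length N and turn it into a number by reading the assigned digits in base NC (in other words, the N characters of the window are written as one integer). Base 10 also works, but only when there are at most 10 different characters: the base must be at least the number of distinct digits. With such a base, different substrings always produce different numbers, so substrings and numbers are in one-to-one correspondence and each distinct number stands for one distinct substring. Finally, count how many different numbers occurred; that is the answer.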
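As a small worked example of the conversion, here is a sketch of the encoding for a single substring. The function name encode and its signature are assumptions made for illustration only.

```cpp
#include <array>
#include <cassert>
#include <string>

// Hypothetical helper: read `sub` as a number written in base `base`,
// where digit[c] is the digit assigned to character c.
long long encode(const std::string& sub, const std::array<int, 256>& digit, int base) {
    long long value = 0;
    for (unsigned char c : sub) {
        assert(digit[c] >= 0 && digit[c] < base);  // every digit must fit the base
        value = value * base + digit[c];
    }
    return value;
}
```

With d=0, a=1, b=2, c=3 and base 4, the five substrings of the sample, "daa", "aab", "aba", "bab", "bac", encode to 5, 22, 25, 38 and 39, which are all different. If the base were smaller than the number of digits, collisions would appear: in base 2, "ba" (2*2+1) and "ac" (1*2+3) would both give 5.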

【Key-value (map) approach: TLE code】

#include<iostream>
#include<map>
#include<string>
using namespace std;
map<string,int> Map;
string strText;
int N;
// Insert every substring of length N into the map; the number of keys is the answer.
void Hash()
{
    int i;
    Map.clear();
    for(i=0;i<(int)strText.size()-N+1;++i)
    {
        string Temp(strText,i,N);
        Map[Temp]=i; // the value is only a placeholder; only the keys matter
    }
    cout<<Map.size()<<endl;
}

int main()
{
    int NC;
    // Input as described in the statement above: N and NC, then the text.
    cin>>N>>NC;
    cin>>strText;
    Hash();
    return 0;
}


【AC code】

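#include<cstdio>
#include<cstring>
// No "using namespace std" here: a global array named hash would clash with std::hash.
const int N=16000005;
int m,n;      // m: substring length (N in the statement); n: NC (read but not needed)
char str[N];
int hash[N];  // hash[v]!=0 once the substring encoded as v has been seen
int vis[500]; // vis[c]: 1-based number of character c, 0 if not numbered yet

int main()
{
    while(~scanf("%d%d%s",&m,&n,str))
    {
        memset(vis,0,sizeof(vis));   // reset state in case of several inputs
        memset(hash,0,sizeof(hash));
        int num=0;
        int len=strlen(str);
        for(int i=0;i<len;++i) // go through every character of the text
        {
            unsigned char c=str[i];
            if(vis[c]==0)     // not seen before
                vis[c]=++num; // number it from 1, so that 0 can mean "unseen"
        }
        int ans=0;
        for(int i=0;i<=len-m;++i) // go through every substring of length m
        {
            int sum=0;
            for(int j=0;j<m;++j)
            {
                sum=sum*num+vis[(unsigned char)str[i+j]]-1; // string to base-num number
            }
            if(!hash[sum]) // first occurrence of this substring
            {
                hash[sum]=1;
                ans++;
            }
        }
        printf("%d\n",ans);
    }
    return 0;
}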
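A possible refinement, not part of the accepted solution: instead of re-encoding every window from scratch, the code of each window can be derived from the previous one in constant time by dropping the leading digit and appending the new one (a rolling update). The sketch below assumes the number of possible codes is small enough to fit in memory, as the statement promises; the name countDistinctRolling is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical variant: same base-k encoding, but each window's code is
// computed from the previous window's code in O(1).
int countDistinctRolling(const std::string& text, int n) {
    assert(n > 0);
    const std::size_t len = text.size();
    if (len < static_cast<std::size_t>(n)) return 0;

    // Number the distinct characters in order of first appearance (-1 = unseen).
    std::vector<int> digit(256, -1);
    int k = 0;
    for (unsigned char c : text)
        if (digit[c] == -1) digit[c] = k++;

    // high = k^(n-1) is the weight of the leading digit; there are high*k codes.
    std::size_t high = 1;
    for (int i = 1; i < n; ++i) high *= k;
    std::vector<bool> seen(high * k, false);

    std::size_t code = 0;
    int answer = 0;
    for (std::size_t i = 0; i < len; ++i) {
        // Drop the leading digit (a no-op until the window is full), append the new one.
        code = (code % high) * k + digit[static_cast<unsigned char>(text[i])];
        if (i + 1 >= static_cast<std::size_t>(n) && !seen[code]) {
            seen[code] = true;
            ++answer;
        }
    }
    return answer;
}
```

On the sample, countDistinctRolling("daababac", 3) walks through the codes 5, 22, 25, 38, 25, 39 and returns 5. The total work is proportional to the text length rather than the text length times N.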