POJ 1200: hashing, Karp-Rabin, discretization (time to review)

2014-08-05 16:51, 323 views

Crazy Search

Time Limit: 1000MS		Memory Limit: 65536K
Total Submissions: 22715		Accepted: 6379

Description

Many people like to solve hard puzzles some of which may lead them to madness. One such puzzle could be finding a hidden prime number in a given text. Such number could be the number of different substrings of a given size that exist in the text. As you soon will discover, you really need the help of a computer and a good algorithm to solve such a puzzle.

Your task is to write a program that given the size, N, of the substring, the number of different characters that may occur in the text, NC, and the text itself, determines the number of different substrings of size N that appear in the text.

As an example, consider N=3, NC=4 and the text "daababac". The different substrings of size 3 that can be found in this text are: "daa"; "aab"; "aba"; "bab"; "bac". Therefore, the answer should be 5.

Input

The first line of input consists of two numbers, N and NC, separated by exactly one space. This is followed by the text where the search takes place. You may assume that the maximum number of substrings formed by the possible set of characters does not exceed 16 Millions.
Output

The program should output just an integer corresponding to the number of different substrings of size N found in the given text.
Sample Input

3 4
daababac

Sample Output

5

Hint

Huge input, scanf is recommended.
Source
Southwestern Europe 2002

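Problem summary:
Given a string, count its distinct substrings of length n. Two integers n and nc are also given:
n is the required substring length, and nc is the number of distinct characters that appear in the string.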
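
As a baseline for checking a faster solution, the task can be solved directly by collecting every window of length n in a set. This is only a sketch: countDistinctNaive is a name introduced here for illustration, not part of the original solution, and its cost of roughly O(|text| * n * log |text|) time, plus a copy of every distinct window, is probably too much for the judge's largest inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical helper (not from the original post): counts the distinct
// substrings of length n by storing each one in an ordered set.
std::size_t countDistinctNaive(const std::string& text, std::size_t n) {
    assert(n > 0);
    std::set<std::string> seen;
    // Slide a window of width n over the text; the set ignores duplicates.
    for (std::size_t i = 0; i + n <= text.size(); ++i) {
        seen.insert(text.substr(i, n));
    }
    return seen.size();
}
```

On the sample, countDistinctNaive("daababac", 3) returns 5, which matches the expected output.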

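Approach:
(1) Discretization: the number of distinct letters in the text is fixed (nc) and may be smaller than 26,
so we first map each letter to a compact code in 0..nc-1, which reduces both space and time.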

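scanf("%s",a);   // read the text
int sz=strlen(a),t=0;
for(int i=0;i<sz;i+=1){    // discretization (assumes the text contains only lowercase letters)
    if(name[a[i]-'a']==-1){
        // number each character on first appearance; a[i]-'a' can be used directly as the index
        name[a[i]-'a']=t++;
    }
}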
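
The snippet above indexes the table with a[i]-'a', which only works for lowercase letters. The problem statement does not say which characters may occur, so a more defensive variant could index a 256-entry table by byte value instead. The following is a minimal sketch under that assumption; discretize and its parameters are hypothetical names, not taken from the original code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: assigns codes 0, 1, 2, ... to characters in order of
// first appearance and returns the text rewritten as those codes.
// `distinct` receives the number of different characters that were seen.
std::vector<int> discretize(const std::string& text, int& distinct) {
    std::vector<int> code(256, -1);  // one slot per possible byte value
    std::vector<int> digits;
    digits.reserve(text.size());
    distinct = 0;
    for (char ch : text) {
        // Cast to unsigned char so bytes >= 128 never produce a negative index.
        unsigned char idx = static_cast<unsigned char>(ch);
        if (code[idx] == -1) {
            code[idx] = distinct++;
        }
        digits.push_back(code[idx]);
    }
    assert(distinct <= 256);
    return digits;
}
```

For the sample text "daababac" this assigns d=0, a=1, b=2, c=3, so the text becomes the digit sequence 0 1 1 2 1 2 1 3.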

(2) Hashing:
Here I use the Karp-Rabin hash function; below is my own understanding of the method.

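The general idea is shown in the figure above, but it is not the only option: the digits can be taken in forward or reverse order.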

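The principle, however, is this:
written in base x, the value Hash(i, L) from the figure above is always an L-digit number (leading zeros count as digits).
In other words, every substring of length L corresponds to one L-digit base-x number, so different substrings never share a value and the value can be used directly as an array index.
Here L is the substring length, and x equals the number of possible characters at each position.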
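
To make the "L-digit base-x number" concrete, here is a small sketch that evaluates one window with Horner's rule, most significant digit first. encodeWindow is a hypothetical helper introduced for illustration, and it expects the text to have been converted to codes 0..x-1 already.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: reads digits[start .. start+L-1] as an L-digit base-x
// number, most significant digit first (Horner's rule).
long long encodeWindow(const std::vector<int>& digits, std::size_t start,
                       std::size_t L, int x) {
    assert(x > 0 && start + L <= digits.size());
    long long value = 0;
    for (std::size_t k = 0; k < L; ++k) {
        assert(digits[start + k] >= 0 && digits[start + k] < x);
        value = value * x + digits[start + k];
    }
    return value;
}
```

With the sample text "daababac", codes d=0, a=1, b=2, c=3, L=3 and x=4, the six windows evaluate to "daa" = 5, "aab" = 22, "aba" = 25, "bab" = 38, "aba" = 25 and "bac" = 39. That gives five distinct values, matching the expected answer of 5.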

Here is the accepted (AC) code:

#include <stdio.h>
#include <string.h>
#define maxn 16000005   // the problem bounds nc^n by 16 million; also used as the text buffer size

char a[maxn];        // input text
int name[30];        // name[c-'a'] = code of letter c, or -1 if unseen (int, so the -1 test works even where char is unsigned)
bool hashv[maxn];    // hashv[v] is true once the window value v has been seen

int main()
{
    int n,nc;
    while(scanf("%d %d",&n,&nc)==2){
        memset(name,-1,sizeof(name));
        memset(hashv,false,sizeof(hashv));
        getchar();
        scanf("%s",a);   // read the text
        int sz=strlen(a),t=0;
        if(sz<n){        // text shorter than n: no substring of length n exists
            printf("0\n");
            continue;
        }
        for(int i=0;i<sz;i+=1){    // discretization (assumes the text contains only lowercase letters)
            if(name[a[i]-'a']==-1){
                // number each character on first appearance; a[i]-'a' can be used directly as the index
                name[a[i]-'a']=t++;
            }
        }

        int tmp=0;
        t=nc;
        for(int i=0;i<n-1;i+=1){       // value of the first n-1 characters; t ends up as nc^n
            tmp=tmp*nc+name[a[i]-'a'];
            t*=nc;
        }

        int countt=0;
        for(int i=n-1;i<sz;i+=1){
            tmp=(tmp*nc+name[a[i]-'a'])%t;
            // for i>=n this is equivalent to tmp=tmp*nc-name[a[i-n]-'a']*t+name[a[i]-'a'];
            // the modulo drops the digit that just left the window
            if(!hashv[tmp]){
                hashv[tmp]=true;
                countt+=1;
            }
        }
        printf("%d\n",countt);
    }
    return 0;
}
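
For testing outside the judge, the same rolling idea can be wrapped in a function that takes the text as an argument. The version below is a sketch, not the submitted code: countDistinct is a hypothetical name, it indexes characters by byte value rather than by a[i]-'a', and it sizes the seen-table to nc^n instead of a fixed 16 million entries. It assumes nc^n is small enough to allocate, as the problem statement promises.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical wrapper around the same rolling-value idea as the accepted code.
int countDistinct(const std::string& text, int n, int nc) {
    assert(n > 0 && nc > 0);
    const std::size_t width = static_cast<std::size_t>(n);
    if (text.size() < width) return 0;  // no window of length n fits

    // Discretize: codes in order of first appearance, indexed by byte value.
    std::vector<int> code(256, -1);
    int next = 0;
    for (char ch : text) {
        unsigned char idx = static_cast<unsigned char>(ch);
        if (code[idx] == -1) code[idx] = next++;
    }
    assert(next <= nc);  // the input promises at most nc distinct characters

    long long base = 1;  // becomes nc^n, the number of possible windows
    for (int i = 0; i < n; ++i) base *= nc;

    std::vector<bool> seen(static_cast<std::size_t>(base), false);
    long long value = 0;
    int count = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        // Append the new digit; the modulo discards the digit that left the window.
        value = (value * nc + code[static_cast<unsigned char>(text[i])]) % base;
        // Only count once the window holds n characters.
        if (i + 1 >= width && !seen[static_cast<std::size_t>(value)]) {
            seen[static_cast<std::size_t>(value)] = true;
            ++count;
        }
    }
    return count;
}
```

On the sample, countDistinct("daababac", 3, 4) returns 5.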
