
Count the string(kmp+dp)

2018-01-31 10:40

Count the string

It is well known that AekdyCoin is good at string problems as well as number theory problems. When given a string s, we can write down all the non-empty prefixes of this string. For example:

s: "abab"

The prefixes are: "a", "ab", "aba", "abab"

For each prefix, we can count the times it matches in s. So we can see that prefix "a" matches twice, "ab" matches twice too, "aba" matches once, and "abab" matches once. Now you are asked to calculate the sum of the match times for all the prefixes. For "abab", it is 2 + 2 + 1 + 1 = 6.

The answer may be very large, so output the answer mod 10007.

Input

The first line is a single integer T, indicating the number of test cases.

For each case, the first line is an integer n (1 <= n <= 200000), which is the length of string s. A line follows giving the string s. The characters in the strings are all lower-case letters.

Output

For each case, output only one number: the sum of the match times for all the prefixes of s mod 10007.

Sample Input
1
4
abab

Sample Output
6


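Computing the next array and then running a separate KMP search for every prefix costs O(n^2) in total, which will certainly exceed the time limit for n up to 200000.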
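For checking answers on small inputs, a direct count is still handy. The sketch below is not from the original post: bruteForceCount is a hypothetical helper name, and it uses std::string::find instead of a hand-written KMP search. It is far too slow for the full constraints and is meant only as a reference.

```cpp
#include <cstddef>
#include <string>

// Reference implementation for small inputs only: for every non-empty prefix,
// count its (possibly overlapping) occurrences in s, modulo 10007.
// bruteForceCount is a hypothetical name introduced for illustration.
int bruteForceCount(const std::string& s) {
    const int MOD = 10007;
    int total = 0;
    for (std::size_t len = 1; len <= s.size(); ++len) {
        const std::string prefix = s.substr(0, len);
        // Restart the search one position after each hit so that
        // overlapping matches are counted as well.
        for (std::size_t pos = s.find(prefix); pos != std::string::npos;
             pos = s.find(prefix, pos + 1)) {
            total = (total + 1) % MOD;
        }
    }
    return total;
}
```

Calling bruteForceCount("abab") should reproduce the sample answer 6, which makes it a convenient cross-check for the faster solution on short random strings.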

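This problem can be solved with a DP built on the properties of the next array, using the recurrence dp[i] = dp[Next[i]] + 1. The meaning of this recurrence and the reason it holds are explained below.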

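dp[i] is the number of prefixes of s that occur ending at the character with index i-1. Why i-1? The string is stored starting at index 0, and Next[i] describes the prefix of length i, that is, indices 0 to i-1. In the same way, dp[i] refers to the prefix of length i (indices 0 to i-1) and counts the prefixes of s that end at character i-1.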

Why is this formula correct?

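First, the +1 counts the current prefix itself, since one more character has just been appended. For example, for the string a b c d (indices 0 1 2 3), dp[3] stores the number of prefixes that end at c within "abc", and the +1 accounts for "abc" itself.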

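Next, consider dp[Next[i]]. Next[i] is the length of the longest string that is both a proper prefix and a suffix of the substring s[0..i-1]; it is also the index p of the character that follows that prefix.

So dp[p] is a value we have already computed: the number of prefixes of s that end at the last character of this longest common prefix/suffix, viewed as a prefix. The same string is also a suffix of s[0..i-1], so each of those prefixes also ends at the newly added character i-1. However many there were before, that many occurrences end at the new character; adding 1 for the prefix of length i itself gives all prefixes that end at character i-1.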
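The argument above can be made concrete by listing the prefixes that end at a given position: start from length i and keep following the next array until it reaches 0. The sketch below assumes the same convention as the post (next value -1 at position 0); buildNext and prefixLengthsEndingAt are hypothetical helper names, and i is assumed to satisfy 1 <= i <= s.size().

```cpp
#include <string>
#include <vector>

// Build the next array with nxt[0] = -1 and
// nxt[j] = length of the longest proper prefix of s[0..j-1] that is also its suffix.
std::vector<int> buildNext(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> nxt(n + 1, 0);
    nxt[0] = -1;
    int i = -1, j = 0;
    while (j < n) {
        if (i == -1 || s[i] == s[j]) {
            ++i;
            ++j;
            nxt[j] = i;
        } else {
            i = nxt[i];
        }
    }
    return nxt;
}

// Lengths of all prefixes of s that end at index i-1: the prefix of length i
// itself, then every shorter one reached by following the next array.
std::vector<int> prefixLengthsEndingAt(const std::string& s, int i) {
    const std::vector<int> nxt = buildNext(s);
    std::vector<int> lengths;
    for (int len = i; len > 0; len = nxt[len]) {
        lengths.push_back(len);
    }
    return lengths;
}
```

The number of entries returned for a given i is exactly dp[i]: one for the prefix of length i, plus dp[Next[i]] for the rest of the chain.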

For example:

index   0   1   2   3   4
s       a   b   a   b
Next   -1   0   0   1   2
dp      0   1   1   2   2

dp[1]=1->"a"   dp[2]=1->"ab"   dp[3]=2->"aba","a"   dp[4]=2->"abab","ab"

The answer is the sum 1 + 1 + 2 + 2 = 6.
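To check a table like this one by hand or in a test, the recurrence can be wrapped in a small function that returns the whole dp array. This is only a sketch using std::string and std::vector rather than global arrays; dpTable is a hypothetical name, and no modulo is applied because individual dp values never exceed n.

```cpp
#include <string>
#include <vector>

// Returns dp[0..n] for s, where dp[0] = 0 and dp[i] = dp[nxt[i]] + 1.
// nxt follows the convention nxt[0] = -1.
std::vector<int> dpTable(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> nxt(n + 1, 0), dp(n + 1, 0);
    nxt[0] = -1;
    int i = -1, j = 0;
    while (j < n) {
        if (i == -1 || s[i] == s[j]) {
            ++i;
            ++j;
            nxt[j] = i;
        } else {
            i = nxt[i];
        }
    }
    for (int k = 1; k <= n; ++k) {
        dp[k] = dp[nxt[k]] + 1;
    }
    return dp;
}
```

For "abab" this should give 0 1 1 2 2, and summing the entries modulo 10007 yields the required answer.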

code:

#include <iostream>
#include <cstring>
#include <cstdio>
using namespace std;

const int MAXN = 200100;
const int MOD = 10007;
char s[MAXN];
int n;
int Next[MAXN];  // Next[j]: longest proper prefix of s[0..j-1] that is also its suffix; Next[0] = -1
int dp[MAXN];    // dp[i]: number of prefixes of s that end at index i-1
int sum;

// Build the KMP next array for s.
void getNext(){
    int i = -1, j = 0;
    Next[0] = -1;
    int len = strlen(s);
    while(j < len){
        if(i == -1 || s[i] == s[j]){
            i++, j++;
            Next[j] = i;
        }
        else
            i = Next[i];
    }
}

int main(){
    int t;
    scanf("%d", &t);
    while(t--){
        scanf("%d", &n);
        scanf("%s", s);
        getNext();
        int i;
        memset(dp, 0, sizeof(dp));
        sum = 0;
        // Accumulate dp[i] over every prefix length, reducing modulo MOD.
        for(i = 1; i <= n; i++){
            dp[i] = dp[Next[i]] + 1;
            sum = (sum + dp[i]) % MOD;
        }
        printf("%d\n", sum);
    }
    return 0;
}