
POJ 1204 Word Puzzles: AC Automaton

2013-08-28 10:08
Word Puzzles

Time Limit: 5000MS    Memory Limit: 65536K
Total Submissions: 8353    Accepted: 3160    Special Judge
Description

Word puzzles are usually simple and very entertaining for all ages. They are so entertaining that Pizza-Hut company started using table covers with word puzzles printed on them, possibly with the intent to minimise their client's perception of any possible delay in bringing them their order.

Even though word puzzles may be entertaining to solve by hand, they may become boring when they get very large. Computers do not yet get bored in solving tasks, therefore we thought you could devise a program to speed up (hopefully!) solution finding in such puzzles.

The following figure illustrates the PizzaHut puzzle. The names of the pizzas to be found in the puzzle are: MARGARITA, ALEMA, BARBECUE, TROPICAL, SUPREMA, LOUISIANA, CHEESEHAM, EUROPA, HAVAIANA, CAMPONESA.



Your task is to produce a program that given the word puzzle and words to be found in the puzzle, determines, for each word, the position of the first letter and its orientation in the puzzle.

You can assume that the left upper corner of the puzzle is the origin, (0,0). Furthermore, the orientation of the word is marked clockwise starting with letter A for north (note: there are 8 possible directions in total).

Input

The first line of input consists of three positive numbers, the number of lines, 0 < L <= 1000, the number of columns, 0 < C <= 1000, and the number of words to be found, 0 < W <= 1000. The following L input lines, each one of size C characters, contain the word puzzle. Then at last the W words are input one per line.
Output

Your program should output, for each word (using the same order as the words were input) a triplet defining the coordinates, line and column, where the first letter of the word appears, followed by a letter indicating the orientation of the word according to the rules defined above. Each value in the triplet must be separated by one space only.
Sample Input
20 20 10
QWSPILAATIRAGRAMYKEI
AGTRCLQAXLPOIJLFVBUQ
TQTKAZXVMRWALEMAPKCW
LIEACNKAZXKPOTPIZCEO
FGKLSTCBTROPICALBLBC
JEWHJEEWSMLPOEKORORA
LUPQWRNJOAAGJKMUSJAE
KRQEIOLOAOQPRTVILCBZ
QOPUCAJSPPOUTMTSLPSF
LPOUYTRFGMMLKIUISXSW
WAHCPOIYTGAKLMNAHBVA
EIAKHPLBGSMCLOGNGJML
LDTIKENVCSWQAZUAOEAL
HOPLPGEJKMNUTIIORMNC
LOIUFTGSQACAXMOPBEIO
QOASDHOPEPNBUYUYOBXB
IONIAELOJHSWASMOUTRK
HPOIYTJPLNAQWDRIBITG
LPOINUYMRTEMPTMLMNBO
PAFCOPLHAVAIANALBPFS
MARGARITA
ALEMA
BARBECUE
TROPICAL
SUPREMA
LOUISIANA
CHEESEHAM
EUROPA
HAVAIANA
CAMPONESA

Sample Output
0 15 G
2 11 C
7 18 A
4 8 C
16 13 B
4 15 E
10 3 D
5 1 E
19 7 C
11 11 H

Source

Southwestern Europe 2002

1. What is an AC automaton

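Like KMP, AC is named after its inventors, Aho and Corasick. The Aho-Corasick (AC) automaton is an algorithm for multi-pattern string matching: for example, given a text file and k target strings, determine which of the k strings occur in the file.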

2. Why learn the AC automaton

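Most readers will know the KMP algorithm, a linear-time matching algorithm for a single pattern. Its main idea is that when the text and the pattern fail to match, the pattern does not have to restart from its beginning. If T[0..j] of the pattern has already matched and the mismatch occurs at T[j+1], matching resumes at T[k], where k is the largest value (k <= j) satisfying T[0]T[1]..T[k-1] = T[j-k+1]..T[j]. This makes full use of the structure of the pattern itself. KMP runs in O(m+k) time, where m is the length of the text and k is the length of the pattern.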
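The fallback rule above can be sketched in a few lines. This is a minimal illustration, not part of the original solution: the names `buildFail` and `kmpSearch` are made up for this example, and `fail[j]` stores the value k described above for the prefix ending at index j.

```cpp
#include <string>
#include <vector>

// fail[j] = length of the longest proper prefix of p[0..j]
// that is also a suffix of p[0..j].
std::vector<int> buildFail(const std::string& p) {
    std::vector<int> fail(p.size(), 0);
    int k = 0;
    for (std::size_t j = 1; j < p.size(); ++j) {
        while (k > 0 && p[j] != p[k]) k = fail[k - 1];  // fall back inside the pattern
        if (p[j] == p[k]) ++k;
        fail[j] = k;
    }
    return fail;
}

// Returns the start index of every occurrence of p in s.
std::vector<int> kmpSearch(const std::string& s, const std::string& p) {
    std::vector<int> hits;
    if (p.empty()) return hits;
    std::vector<int> fail = buildFail(p);
    int k = 0;  // number of pattern characters currently matched
    for (std::size_t i = 0; i < s.size(); ++i) {
        while (k > 0 && s[i] != p[k]) k = fail[k - 1];  // never move i backwards
        if (s[i] == p[k]) ++k;
        if (k == static_cast<int>(p.size())) {
            hits.push_back(static_cast<int>(i) - k + 1);
            k = fail[k - 1];  // keep going to find overlapping matches
        }
    }
    return hits;
}
```

For instance, calling `kmpSearch("ABABABC", "ABABC")` falls back from four matched characters to two at the first mismatch instead of restarting, and reports the single occurrence starting at index 2.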

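Using KMP for multi-pattern matching costs O((m+k1)+(m+k2)+...+(m+kk)) = O(n+km), where k is the number of patterns and n = sigma(ki) is the total length of all patterns. So for multi-pattern matching, KMP is no longer linear. This is where the AC automaton comes in: it solves multi-pattern matching in O(m+n+z) time, where z is the total number of pattern occurrences in the text. Tempting, isn't it?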

3. What you need to know first

To learn the AC automaton well, you need a solid understanding of the KMP algorithm and the Trie (prefix tree).

4. How to build an AC automaton

Building an AC automaton takes two steps: build a Trie from the patterns, then create the failure pointers with a BFS.

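A failure pointer plays the role of the next array in KMP: while the text is being matched on the Trie, if the current node cannot extend the match, fall back to the node that its failure pointer points to.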

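The main point here is how the failure pointers are built. First, the failure pointer of every node directly adjacent to the root points to the root, and those nodes are pushed onto the queue. Then, for a node p1 taken from the queue that has a child c1 holding character C, follow the failure pointers starting from p1 until reaching a node p2 that also has a child c2 holding character C, and set the failure pointer of c1 to c2. If no such node is found, the failure pointer of c1 points to the root. The meaning is that the string represented by c2 is the longest proper suffix of the string represented by c1 that is also a prefix of some pattern.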
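The construction just described can be sketched on a small array-based Trie. This is an illustrative sketch under assumptions that differ from the solution further down: nodes live in a `std::vector` and are referred to by index (0 is the root, -1 means "no child"), patterns are lowercase, and the names `Node`, `insertWord` and `buildFail` are invented for this example. The variables `p1`, `c1`, `p2`, `c2` follow the naming in the paragraph above.

```cpp
#include <queue>
#include <string>
#include <vector>

struct Node {
    int next[26];  // child indices, -1 if absent
    int fail;      // failure pointer (index of another node)
    Node() : fail(0) { for (int i = 0; i < 26; ++i) next[i] = -1; }
};

// Inserts a lowercase word; returns the index of the node where it ends.
int insertWord(std::vector<Node>& trie, const std::string& word) {
    int cur = 0;
    for (std::size_t i = 0; i < word.size(); ++i) {
        int ch = word[i] - 'a';
        if (trie[cur].next[ch] == -1) {
            trie[cur].next[ch] = static_cast<int>(trie.size());
            trie.push_back(Node());
        }
        cur = trie[cur].next[ch];
    }
    return cur;
}

// BFS over the Trie, so every node's parent is finished before the node itself.
void buildFail(std::vector<Node>& trie) {
    std::queue<int> q;
    for (int ch = 0; ch < 26; ++ch) {
        int child = trie[0].next[ch];
        if (child != -1) { trie[child].fail = 0; q.push(child); }
    }
    while (!q.empty()) {
        int p1 = q.front(); q.pop();
        for (int ch = 0; ch < 26; ++ch) {
            int c1 = trie[p1].next[ch];
            if (c1 == -1) continue;
            int p2 = trie[p1].fail;
            // Climb until some node has a child on the same character.
            while (p2 != 0 && trie[p2].next[ch] == -1) p2 = trie[p2].fail;
            int c2 = trie[p2].next[ch];
            trie[c1].fail = (c2 != -1) ? c2 : 0;  // no candidate: point to the root
            q.push(c1);
        }
    }
}
```

With the patterns she and he inserted into a Trie that starts as `std::vector<Node> trie(1)`, the node for she ends up with its failure pointer on the node for he, because he is the longest suffix of she that is also a path from the root.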

5. Querying the AC automaton

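If the current text character matches a child on the Trie, move to that node and check whether it marks the end of a pattern; if so, record it. Note that you must also keep following the failure pointers from that node, because a suffix of the matched string may itself be a pattern: when searching the text yashe with patterns she and he, she is found first and then he. Then continue down the path and match the next character. If the current character does not match, move to the node given by the current node's failure pointer and try again. Repeat whichever of these two cases applies until the end of the text is reached.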
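The yashe example can be traced with a small self-contained sketch. It repeats the construction in compact form so the block stands alone, and it rests on assumptions of its own: lowercase input, index-based nodes, and an invented `Automaton` type whose `search` returns pairs of (pattern id, index of the last matched character). It illustrates the query loop and is not the solution used for this problem.

```cpp
#include <queue>
#include <string>
#include <utility>
#include <vector>

struct Automaton {
    struct Node {
        int next[26];
        int fail;
        int word;  // id of the pattern ending here, or -1
        Node() : fail(0), word(-1) { for (int i = 0; i < 26; ++i) next[i] = -1; }
    };
    std::vector<Node> t;
    Automaton() : t(1) {}  // node 0 is the root

    void insert(const std::string& s, int id) {
        int cur = 0;
        for (std::size_t i = 0; i < s.size(); ++i) {
            int ch = s[i] - 'a';
            if (t[cur].next[ch] == -1) {
                t[cur].next[ch] = static_cast<int>(t.size());
                t.push_back(Node());
            }
            cur = t[cur].next[ch];
        }
        t[cur].word = id;
    }

    void build() {  // BFS; nodes next to the root keep fail == 0
        std::queue<int> q;
        for (int ch = 0; ch < 26; ++ch)
            if (t[0].next[ch] != -1) q.push(t[0].next[ch]);
        while (!q.empty()) {
            int u = q.front(); q.pop();
            for (int ch = 0; ch < 26; ++ch) {
                int v = t[u].next[ch];
                if (v == -1) continue;
                int f = t[u].fail;
                while (f != 0 && t[f].next[ch] == -1) f = t[f].fail;
                t[v].fail = (t[f].next[ch] != -1) ? t[f].next[ch] : 0;
                q.push(v);
            }
        }
    }

    std::vector<std::pair<int, int> > search(const std::string& text) const {
        std::vector<std::pair<int, int> > hits;
        int cur = 0;
        for (std::size_t i = 0; i < text.size(); ++i) {
            int ch = text[i] - 'a';
            // Mismatch: fall back along failure pointers.
            while (cur != 0 && t[cur].next[ch] == -1) cur = t[cur].fail;
            if (t[cur].next[ch] != -1) cur = t[cur].next[ch];
            // Report every pattern that ends at position i.
            for (int p = cur; p != 0; p = t[p].fail)
                if (t[p].word != -1)
                    hits.push_back(std::make_pair(t[p].word, static_cast<int>(i)));
        }
        return hits;
    }
};
```

After inserting she with id 0 and he with id 1 and calling `build()`, `search("yashe")` reports she and then he, both ending at index 4, which is the order described above.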

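Problem: in an r*c (r, c <= 1000) word puzzle, find m words and output each word's starting position and direction (8 directions, clockwise starting from north, labelled A through H).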

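Analysis: build the AC automaton from the reversed words, then scan the puzzle along each of the 8 directions. Because the words are inserted reversed, a match ends exactly on the first letter of the word, so the cell where the match ends is the position to report, and the word's orientation is the opposite of the scan direction (this is why the letters in the d[] table are shifted by four).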
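The reversed-word trick can be checked on a single row of the sample puzzle. The sketch below is only an illustration: it uses `std::string::find` in place of the automaton, handles one westward scan of one row, and the helper names `reportedLetter` and `firstLetterColumn` are made up for this example. Direction indices follow the dir[] table of the solution (0 is north, then clockwise).

```cpp
#include <string>

// Scanning in direction scanDir matches reversed words, so the word itself
// runs in the opposite direction, (scanDir + 4) % 8.
char reportedLetter(int scanDir) {
    return static_cast<char>('A' + (scanDir + 4) % 8);
}

// Scans `row` from right to left (west, scan direction 6) for the reversed
// word. Returns the column where the reversed match ends, which is the column
// of the word's first letter, or -1 if the word does not occur reading east.
int firstLetterColumn(const std::string& row, const std::string& word) {
    std::string scanned(row.rbegin(), row.rend());        // characters in scan order
    std::string reversedWord(word.rbegin(), word.rend());
    std::string::size_type pos = scanned.find(reversedWord);
    if (pos == std::string::npos) return -1;
    std::string::size_type endInScan = pos + reversedWord.size() - 1;
    return static_cast<int>(row.size() - 1 - endInScan);  // back to a column index
}
```

On row 2 of the sample, `firstLetterColumn("TQTKAZXVMRWALEMAPKCW", "ALEMA")` gives column 11 and `reportedLetter(6)` gives C, which agrees with the sample output line 2 11 C.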

#include <iostream>
#include <cstdio>
#include <cstring>
#include <algorithm>
#include <queue>
#include <vector>
using namespace std;
const int Max = 1005;
int r,c,w;
char puzzle[Max][Max];
char wd[Max*3];
int len[Max];
int dir[8][2] = {{-1,0},{-1,1},{0,1},{1,1},{1,0},{1,-1},{0,-1},{-1,-1}};//up .... eight directions
char d[8] = {'E','F','G','H','A','B','C','D'};

struct Trie_Node
{
    Trie_Node* fail;
    Trie_Node* next[26];
    int value;
    Trie_Node()
    {
        value = 0;
        fail = NULL;
        memset(next,0,sizeof(next));
    }
};

void insertWord(Trie_Node* root, char* s, int len, int seq) // insert the word reversed
{
    int del;
    Trie_Node* p = root;
    for(int i = len-1; i >= 0; i--)
    {
        del = s[i] - 'A';
        if(p->next[del] == NULL)
            p->next[del] = new Trie_Node();
        p = p->next[del];
    }
    p->value = seq;
}

void build_ac_automachine(Trie_Node* root)
{
    int i;
    queue<Trie_Node*> que;
    root->fail = NULL;
    for(i = 0; i < 26; i++)
    {
        if(root->next[i] != NULL)
        {
            root->next[i]->fail = root;
            que.push(root->next[i]);
        }
    }
    Trie_Node* now;
    while(!que.empty())
    {
        now = que.front();
        que.pop();
        for(i = 0; i < 26; i++)
        {
            if(now->next[i] == NULL)
                continue;
            Trie_Node* p = now->fail;
            while(p!=NULL&&p->next[i]==NULL)
                p = p->fail;
            if(p == NULL)
                now->next[i]->fail = root;
            else
                now->next[i]->fail = p->next[i];
            que.push(now->next[i]);
        }
    }
}

int X[Max],Y[Max],D[Max];

void SearchPatterns(Trie_Node* root, int i, int j, int k)
{
    int x=i,y=j;
    int del;
    Trie_Node* now = root;
    while(true)
    {
        if(x<0||x>=r||y<0||y>=c)
            break;
        del = puzzle[x][y]-'A';
        while(now->next[del]==NULL&&now!=root)
            now = now->fail;
        now = now->next[del];
        if(now == NULL)
            now = root;
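        // Walk the whole fail chain: a shorter word can end here even when
        // an intermediate node on the chain is not the end of a word.
        for(Trie_Node* p = now; p != root; p = p->fail)
        {
            if(p->value == 0)
                continue;
            X[p->value] = x;
            Y[p->value] = y;
            D[p->value] = k;
            p->value = 0; // record each word only once
        }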
        x += dir[k][0];
        y += dir[k][1];
    }
}

int main()
{
    int i,j;
    Trie_Node* root = new Trie_Node();
    scanf("%d %d %d",&r,&c,&w);
    for(i = 0; i < r; i++)
        scanf("%s",puzzle[i]);
    for(i = 1; i <= w; i++)
    {
        scanf("%s",wd);
        len[i] = strlen(wd);
        insertWord(root,wd,strlen(wd),i);
    }
    build_ac_automachine(root);
    // query along the 8 directions
    for(i = 0; i < r; i++)
    {
        SearchPatterns(root,i,0,2);
        SearchPatterns(root,i,c-1,6);
        SearchPatterns(root,i,0,3);
        SearchPatterns(root,i,c-1,7);
        SearchPatterns(root,i,0,1);
        SearchPatterns(root,i,c-1,5);
    }
    for(j = 0; j < c; j++)
    {
        SearchPatterns(root,r-1,j,0);
        SearchPatterns(root,0,j,4);
        SearchPatterns(root,0,j,3);
        SearchPatterns(root,r-1,j,7);
        SearchPatterns(root,r-1,j,1);
        SearchPatterns(root,0,j,5);
    }
    for(i = 1; i <= w; i++)
        printf("%d %d %c\n",X[i],Y[i],d[D[i]]);
    return 0;
}