UVa123 Searching Quickly
2013-08-05 08:54
Searching Quickly
Background
Searching and sorting are part of the theory and practice of computer science. For example, binary search provides a good example of an easy-to-understand algorithm with sub-linear complexity. Quicksort is an efficient [average case] comparison based sort.
KWIC-indexing is an indexing method that permits efficient "human search" of, for example, a list of titles.
The Problem
Given a list of titles and a list of "words to ignore", you are to write a program that generates a KWIC (Key Word In Context) index of the titles. In a KWIC-index, a title is listed once for each keyword that occurs in the title. The KWIC-index is alphabetized by keyword.
Any word that is not one of the "words to ignore" is a potential keyword.
For example, if words to ignore are "the, of, and, as, a" and the list of titles is:
Descent of Man
The Ascent of Man
The Old Man and The Sea
A Portrait of The Artist As a Young Man
A KWIC-index of these titles might be given by:
a portrait of the ARTIST as a young man
the ASCENT of man
DESCENT of man
descent of MAN
the ascent of MAN
the old MAN and the sea
a portrait of the artist as a young MAN
the OLD man and the sea
a PORTRAIT of the artist as a young man
the old man and the SEA
a portrait of the artist as a YOUNG man
The Input
The input is a sequence of lines. The string :: is used to separate the list of words to ignore from the list of titles. Each of the words to ignore appears in lower-case letters on a line by itself and is no more than 10 characters in length.
Each title appears on a line by itself and may consist of mixed-case (upper and lower) letters. Words in a title are separated by whitespace. No title contains more than 15 words.
There will be no more than 50 words to ignore, no more than 200 titles, and no more than 10,000 characters in the titles and words to ignore combined. No characters other than 'a'-'z', 'A'-'Z', and whitespace will appear in the input.
The Output
The output should be a KWIC-index of the titles, with each title appearing once for each keyword in the title, and with the KWIC-index alphabetized by keyword. If a word appears more than once in a title, each instance is a potential keyword. The keyword should appear in all upper-case letters. All other words in a title should be in lower-case letters. Titles in the KWIC-index with the same keyword should appear in the same order as they appeared in the input file. In the case where multiple instances of a word are keywords in the same title, the keywords should be capitalized in left-to-right order.
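The ordering rules above (alphabetical by keyword, same-keyword titles in input order, repeated keywords left to right) can be met by collecting the entries in input order and then applying a stable sort on the keyword alone. A minimal sketch, assuming a hypothetical Entry record and sortIndex helper that are not part of the problem statement:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record for one line of the KWIC-index (names are illustrative).
struct Entry {
    std::string keyword;  // keyword in lower case
    int title;            // index of the title in input order
    int offset;           // character offset of the keyword within the title
};

// Sorts entries by keyword only. std::stable_sort keeps equal elements in
// their original relative order, so entries appended title by title and
// left to right stay in that order within each keyword.
void sortIndex(std::vector<Entry>& entries) {
    std::stable_sort(entries.begin(), entries.end(),
                     [](const Entry& a, const Entry& b) {
                         return a.keyword < b.keyword;
                     });
}
```

An explicit three-key comparison (keyword, then title index, then offset) with a plain sort gives the same order; the stable sort simply relies on the order in which the entries were collected.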
Case (upper or lower) is irrelevant when determining if a word is to be ignored.
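One way to honour this rule is to lower-case each title word before looking it up in the ignore list. A small sketch, assuming the ignore list is held in a std::set; the helper names toLower and isIgnored are illustrative only:

```cpp
#include <cctype>
#include <set>
#include <string>

// Returns a lower-case copy of word.
std::string toLower(std::string word) {
    for (char& c : word)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return word;
}

// True if word, compared without regard to case, is in the ignore set.
// The set is assumed to hold lower-case words, which the input guarantees.
bool isIgnored(const std::string& word, const std::set<std::string>& ignored) {
    return ignored.count(toLower(word)) > 0;
}
```

With the sample ignore list, isIgnored("IS", ignored) and isIgnored("The", ignored) are both true, while isIgnored("Man", ignored) is false.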
The titles in the KWIC-index need NOT be justified or aligned by keyword; all titles may be listed left-justified.
Sample Input
is
the
of
and
as
a
but
::
Descent of Man
The Ascent of Man
The Old Man and The Sea
A Portrait of The Artist As a Young Man
A Man is a Man but Bubblesort IS A DOG
Sample Output
a portrait of the ARTIST as a young man
the ASCENT of man
a man is a man but BUBBLESORT is a dog
DESCENT of man
a man is a man but bubblesort is a DOG
descent of MAN
the ascent of MAN
the old MAN and the sea
a portrait of the artist as a young MAN
a MAN is a man but bubblesort is a dog
a man is a MAN but bubblesort is a dog
the OLD man and the sea
a PORTRAIT of the artist as a young man
the old man and the SEA
a portrait of the artist as a YOUNG man
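The task is to find keywords. The input first gives several lines of non-keywords (the words to ignore); then, in the n title lines that follow, every word that is not an ignored word counts as a keyword. The titles containing each keyword are printed in dictionary order of the keywords, with the keyword in upper case and everything else in lower case. The main idea is a struct that stores each keyword together with its coordinates: the index i of the title it comes from and its position j within that title. While the titles are read, each one is split into words and every word is compared against the ignore list; if it is a keyword, the keyword and its coordinates are stored in the struct. The output is then produced in segments: the part of the title before the keyword is printed as is, the characters from the keyword's position up to the keyword's length are converted from lower case to upper case, and the rest of the title is printed unchanged.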
#include <iostream>
#include <cstring>
#include <cctype>
#include <algorithm>

// Note: no "using namespace std;" here, so the global array named ignore
// cannot clash with std::ignore on newer compilers.

char ignore[60][20];     // words to ignore: at most 50, each at most 10 letters
char title[210][10010];  // titles, stored in lower case

struct Keyword {
    char keyword[20];  // keyword text; assumes no title word exceeds 19 letters
    int origin_1;      // index of the title that contains the keyword
    int origin_2;      // offset of the keyword's first letter in that title
} k[10010];

// Returns true if tmp is one of the first t words in ignore.
bool IsIgnored(const char *tmp, char ignore[][20], int t) {
    for (int i = 0; i < t; i++)
        if (!std::strcmp(tmp, ignore[i]))
            return true;
    return false;
}

// Orders by keyword, then by title index, then by position in the title.
bool cmp(const Keyword &a, const Keyword &b) {
    int c = std::strcmp(a.keyword, b.keyword);
    if (c != 0)
        return c < 0;
    if (a.origin_1 != b.origin_1)
        return a.origin_1 < b.origin_1;
    return a.origin_2 < b.origin_2;
}

int main() {
    int t_1 = 0;  // the number of ignored words.
    while (std::cin >> ignore[t_1]) {
        if (ignore[t_1][0] == ':' && ignore[t_1][1] == ':')
            break;
        t_1++;
    }

    int t_2 = 0;        // the number of titles.
    char tmp_1[10010];  // temporary title.
    std::cin.getline(tmp_1, sizeof(tmp_1));  // skip the rest of the "::" line
    // gets() was removed from the language, so read lines with getline.
    while (std::cin.getline(tmp_1, sizeof(tmp_1))) {
        int len_1 = std::strlen(tmp_1);
        for (int i = 0; i < len_1; i++)
            tmp_1[i] = std::tolower((unsigned char)tmp_1[i]);
        std::strcpy(title[t_2], tmp_1);
        t_2++;
    }

    char tmp_2[20];  // temporary keyword.
    int len_2 = 0;   // the length of the temporary keyword.
    int t_3 = 0;     // the number of keywords.
    for (int i = 0; i < t_2; i++) {
        int len_3 = std::strlen(title[i]);
        // Scan one position past the end so the last word is flushed too.
        for (int j = 0; j <= len_3; j++) {
            if (j < len_3 && std::isalpha((unsigned char)title[i][j])) {
                tmp_2[len_2++] = title[i][j];
            } else if (len_2 > 0) {  // a word has just ended
                tmp_2[len_2] = '\0';
                if (!IsIgnored(tmp_2, ignore, t_1)) {
                    std::strcpy(k[t_3].keyword, tmp_2);
                    k[t_3].origin_1 = i;
                    k[t_3].origin_2 = j - len_2;
                    t_3++;
                }
                len_2 = 0;
            }
        }
    }

    std::sort(k, k + t_3, cmp);

    // Print each title with only the keyword's characters in upper case.
    for (int i = 0; i < t_3; i++) {
        const char *s = title[k[i].origin_1];
        int begin = k[i].origin_2;
        int end = begin + (int)std::strlen(k[i].keyword);
        for (int j = 0; s[j] != '\0'; j++) {
            bool in_keyword = j >= begin && j < end;
            std::cout << (char)(in_keyword ? std::toupper((unsigned char)s[j]) : s[j]);
        }
        std::cout << '\n';
    }
    return 0;
}
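The output step of the program above, which upper-cases only the characters at the keyword's stored offset, can also be written as a small function over std::string. This is a sketch rather than part of the original solution; the name render and its parameters are hypothetical:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns title in lower case, except for the len characters starting at
// offset, which are upper-cased. Assumes offset + len <= title.size().
std::string render(const std::string& title, std::size_t offset, std::size_t len) {
    std::string out = title;
    for (std::size_t i = 0; i < out.size(); i++) {
        unsigned char c = static_cast<unsigned char>(out[i]);
        bool inKeyword = i >= offset && i < offset + len;
        out[i] = static_cast<char>(inKeyword ? std::toupper(c) : std::tolower(c));
    }
    return out;
}
```

For example, render("The Old Man and The Sea", 8, 3) yields "the old MAN and the sea". Working from the stored offset rather than searching for the keyword text is what keeps repeated words such as the two occurrences of "man" in the last sample title distinct.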