XVII Open Cup named after E.V. Pankratiev. Eastern Grand Prix. Problem G. Gmoogle (simulation, string processing, text search)
2017-12-08 14:25
XVII Open Cup named after E.V. Pankratiev. Eastern Grand Prix.
Problem G. Gmoogle
Input file: standard input
Output file: standard output
Time limit: 1 second
Memory limit: 256 megabytes
You are hired to create an alpha version of a new search engine named GMoogle. The alpha version should
work with content represented as a database of sentences:
• The content is merged into a line S consisting of the characters `a'-`z', `A'-`Z', spaces, punctuation marks (".!?",
quotes are not counted) and decimal digits.
• If one of the characters ".!?" occurs in S, it denotes the end of a sentence, except for
one special case: if the first non-space character after `.' is a lowercase English letter, then the `.' is an
abbreviation sign and not the end of the sentence. For example, the string "I like tea in a 500
ml. cup" contains one sentence, but the strings "Cup is 500 ml. I want it" and "Cup is 500
ml. 500 ml is great for me" each contain two sentences.
• The first non-space character after the end of a sentence is considered the first character of the next
sentence.
• A word is a contiguous sequence of characters `a'-`z', `A'-`Z', delimited by spaces, punctuation marks or
the beginning/end of the sentence/string. It is guaranteed that digits are never adjacent to
letters, i.e. sequences like "10ml" or "R2D2" are illegal.
• S may contain sentences that contain no words. It is guaranteed that S does not contain two or
more of the characters ".!?" in a row.
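The sentence-boundary rules above can be sketched as follows. This is only an illustration under my reading of the statement, not the reference solution; `endsSentence` and `splitSentences` are hypothetical names, and text left over after the last punctuation mark is assumed to count as a sentence (as in the "500 ml. cup" example).

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: does the character at position i end a sentence?
bool endsSentence(const std::string& s, std::size_t i) {
    if (s[i] == '!' || s[i] == '?') return true;
    if (s[i] != '.') return false;
    std::size_t j = i + 1;
    while (j < s.size() && s[j] == ' ') ++j;  // find the first non-space character
    // A '.' followed by a lowercase letter is an abbreviation sign.
    return !(j < s.size() && std::islower(static_cast<unsigned char>(s[j])));
}

// Hypothetical helper: split S into sentences with outer spaces removed.
std::vector<std::string> splitSentences(const std::string& s) {
    std::vector<std::string> out;
    std::string cur;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (cur.empty() && s[i] == ' ') continue;  // skip spaces before a sentence
        cur += s[i];
        if (endsSentence(s, i)) {
            out.push_back(cur);
            cur.clear();
        }
    }
    // Assumption: trailing text without a closing mark is still a sentence.
    while (!cur.empty() && cur.back() == ' ') cur.pop_back();
    if (!cur.empty()) out.push_back(cur);
    return out;
}
```

Calling `splitSentences("Cup is 500 ml. I want it")` yields the two sentences `Cup is 500 ml.` and `I want it`, while `"I like tea in a 500 ml. cup"` stays a single sentence.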
After the content is indexed, users make requests. Each request is a string q consisting
of one or more words (the definition of a word is given above). Words are separated by an arbitrary number
of spaces (1 or more); leading and trailing spaces are possible.
Your program has to print all sentences from S that contain all words from q (in any order).
Words are considered equal if the letters at all corresponding positions are the same (case-insensitive,
i.e. `B' and `b' are considered the same).
Input
The first line of the input contains a non-empty line S consisting of no more than 1000 characters. The next line
contains one integer n (1 ≤ n ≤ 100), the number of requests. Then n requests q1, ..., qn follow, each
on a separate line in the format described above. Note that leading and trailing spaces are
allowed in S and qi.
Output
For each request q1, q2, ..., qn print the request on a separate line. Then print the list of found sentences
in the same order as they appear in S, one sentence per line. Requests and answers are printed in quotes;
answers are preceded by a single `-' and a single space; leading and trailing spaces must be removed.
See the sample for clarification.
Example
standard input
Hello everyone. I want 2 coffee if you have it. I like coffee very much.
4
HELLO
Coffee
much coffee
VoDka
standard output
Search results for "HELLO":
- "Hello everyone."
Search results for "Coffee":
- "I want 2 coffee if you have it."
- "I like coffee very much."
Search results for "much coffee":
- "I like coffee very much."
Search results for "VoDka":
Source
XVII Open Cup named after E.V. Pankratiev. Eastern
Grand Prix.
My Solution
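Problem summary: simulate a search system. A text is given, and each query consists of several words; output every sentence that contains all of the query words.
Simulation, string processing, text search
First preprocess the text into individual sentences numbered 0, 1, 2, ..., and use a map<string, vector<int>> to map each word to the sentences that contain it.
Each word of a query then has a set of sentences, and the intersection of those sets is the answer.
The intersection is computed with a map<int, int> check that counts how many of those sets each sentence appears in; after one final pass over check,
the sentences whose count equals the number of words in the query form the intersection.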
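The counting trick for the intersection can be sketched in isolation as below. `sentencesWithAll` is a hypothetical helper name, and the sketch assumes that each word's id list contains every sentence id at most once and that the query words are already lowercase.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: ids of the sentences that contain every query word.
// Assumes each vector in `index` lists a sentence id at most once.
std::vector<int> sentencesWithAll(
    const std::map<std::string, std::vector<int>>& index,
    const std::vector<std::string>& words) {
    std::map<int, int> hits;  // sentence id -> number of query words found in it
    for (const std::string& w : words) {
        auto it = index.find(w);
        if (it == index.end()) continue;  // unknown word: contributes no hits
        for (int id : it->second) ++hits[id];
    }
    std::vector<int> out;
    for (const auto& entry : hits) {
        // A sentence is in the intersection iff every query word hit it.
        if (entry.second == static_cast<int>(words.size())) out.push_back(entry.first);
    }
    return out;  // ascending ids, because std::map iterates in key order
}
```

Because `std::map` iterates in key order, the ids come out in the order the sentences appear in S, which is the order the output requires.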
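Notes: 1. A sentence may contain the same word several times; when building the mapping, map each word to that sentence only once.
2. When the first non-space character after '.' is a lowercase letter, the '.' does not end the sentence.
3. The last sentence of the text may have no punctuation mark and may be followed by many spaces; handle this case.
4. The text is deliberately split one sentence at a time: first collect the characters of a sentence, and only once it is confirmed that the sentence ends here, build the mapping from its words to that sentence.
5. Both when building the mapping and when processing queries, use islower, isupper and tolower from cctype to convert everything to lowercase before comparing.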
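Notes 1 and 5 can be illustrated with a small tokenizer. This is a sketch rather than the code used in the solution: `lowerWords` is a hypothetical name, and it deduplicates with a `std::set` instead of comparing against the last stored sentence id.

```cpp
#include <cctype>
#include <set>
#include <string>

// Hypothetical helper: the distinct words of `text`, converted to lowercase.
std::set<std::string> lowerWords(const std::string& text) {
    std::set<std::string> words;
    std::string cur;
    for (char ch : text) {
        unsigned char c = static_cast<unsigned char>(ch);
        if (std::isalpha(c)) {
            cur += static_cast<char>(std::tolower(c));  // compare case-insensitively
        } else if (!cur.empty()) {
            words.insert(cur);  // a non-letter closes the current word
            cur.clear();
        }
    }
    if (!cur.empty()) words.insert(cur);  // word that runs to the end of the text
    return words;
}
```

For the text `Coffee and COFFEE 2 go.` this returns the three words `and`, `coffee` and `go`, so the repeated word is recorded only once.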
Time complexity: O(nlogn + k*qlogn)
Space complexity: O(n)
#include <bits/stdc++.h>
using namespace std;

string s, word, line;
vector<string> senc;           // sentences, in order of appearance
map<string, vector<int>> mp;   // lowercase word -> ids of the sentences containing it
map<int, int> check;           // sentence id -> number of query words matched

int main () {
#ifdef LOCAL
    freopen("g.txt", "r", stdin);
#endif // LOCAL
    getline(cin, s);
    int n, sz = s.size(), i, j, len, cnt = 0, k;
    // Drop trailing spaces; the sz > 0 guard keeps s[sz-1] in range if S is all spaces.
    while(sz > 0 && s[sz-1] == ' '){
        sz--;
    }
    bool flag;
    for(i = 0; i < sz; i++){
        if(s[i] == '.' || s[i] == '!' || s[i] == '?'){
            flag = true;   // true if this mark ends the sentence
            if(s[i] == '.'){
                // A '.' whose next non-space character is lowercase is an abbreviation.
                for(j = i + 1; j < sz; j++){
                    if(islower(s[j])){
                        flag = false;
                        break;
                    }
                    else if(s[j] != ' ' && s[j] != '\0'){
                        break;
                    }
                }
            }
            if(!flag){
                line += s[i];
                continue;
            }
            len = line.size();
            if(len != 0){
                // The sentence ends here: index its words under sentence id cnt.
                for(j = 0; j < len; j++){
                    if(islower(line[j])){
                        word += line[j];
                    }
                    else if(isupper(line[j])){
                        word += tolower(line[j]);
                    }
                    else if(!word.empty()){
                        // Record each word at most once per sentence.
                        if(mp[word].empty() || (!mp[word].empty() && mp[word].back() != cnt))
                            mp[word].push_back(cnt);
                        word.clear();
                    }
                }
                if(!word.empty()){
                    if(mp[word].empty() || (!mp[word].empty() && mp[word].back() != cnt))
                        mp[word].push_back(cnt);
                    word.clear();
                }
                line += s[i];
                senc.push_back(line);
                line.clear();
                cnt++;
            }
        }
        else{
            // Skip spaces before the first character of a sentence.
            if(line.size() == 0 && (s[i] == ' ' || s[i] == '\0')){
                ;
            }
            else{
                line += s[i];
            }
        }
    }
    // The last sentence may have no closing punctuation mark.
    len = line.size();
    if(len != 0){
        for(j = 0; j < len; j++){
            if(islower(line[j])){
                word += line[j];
            }
            else if(isupper(line[j])){
                word += tolower(line[j]);
            }
            else if(!word.empty()){
                if(mp[word].empty() || (!mp[word].empty() && mp[word].back() != cnt))
                    mp[word].push_back(cnt);
                word.clear();
            }
        }
        if(!word.empty()){
            if(mp[word].empty() || (!mp[word].empty() && mp[word].back() != cnt))
                mp[word].push_back(cnt);
            word.clear();
        }
        senc.push_back(line);
        line.clear();
        cnt++;
    }
    /* Debug: dump the index.
    for(auto x = mp.begin(); x != mp.end(); x++){
        cout << (x->first) << endl;
        sz = (x->second).size();
        for(i = 0; i < sz; i++){
            cout << " " << (x->second)[i] ;
        }
        cout << endl;
    }
    cout << endl;
    */
    cin >> n;
    // Discard the rest of the line holding n before reading the queries.
    cin.ignore(numeric_limits<streamsize>::max(), '\n');
    for(i = 0; i < n; i++){
        getline(cin, s);
        cout << "Search results for \"" << s << "\":\n";
        len = s.size();
        cnt = 0;   // from here on, cnt is the number of words in the query
        for(j = 0; j < len; j++){
            if(islower(s[j])){
                word += s[j];
            }
            else if(isupper(s[j])){
                word += tolower(s[j]);
            }
            else if(!word.empty()){
                if(mp.find(word) != mp.end()){
                    sz = mp[word].size();
                    for(k = 0; k < sz; k++){
                        check[mp[word][k]]++;
                    }
                }
                cnt++;
                word.clear();
            }
        }
        if(!word.empty()){
            if(mp.find(word) != mp.end()){
                sz = mp[word].size();
                for(k = 0; k < sz; k++){
                    check[mp[word][k]]++;
                }
            }
            cnt++;
            word.clear();
        }
        // A sentence matched by every query word belongs to the intersection.
        for(auto x = check.begin(); x != check.end(); x++){
            if((x->second) == cnt){
                cout << "- \"" << senc[x->first] << "\"\n";
            }
        }
        check.clear();
    }
}
Thank you!
------from ProLights