Accelerated C++, p. 110: find_urls
2010-10-13 14:30
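
    /*
     * Accelerated C++, p. 110
     * Find every valid URL in a document.
     * Input:  const string& s, a string holding the whole document
     * Output: vector<string> holding every valid URL found
     */
    #include <algorithm>
    #include <cctype>
    #include <cstdio>     // freopen
    #include <iostream>
    #include <string>
    #include <vector>

    using std::cin;   using std::cout;    using std::endl;
    using std::find;  using std::find_if; using std::search;
    using std::string; using std::vector;

    string::const_iterator url_end(string::const_iterator b, string::const_iterator e);
    string::const_iterator url_beg(string::const_iterator b, string::const_iterator e);

    vector<string> find_urls(const string& s)
    {
        vector<string> ret;
        typedef string::const_iterator iter;
        iter b = s.begin();
        iter e = s.end();

        // scan the whole input
        while (b != e) {
            // look for one or more letters followed by "://"
            b = url_beg(b, e);
            if (b != e) {
                // found one: locate the end of the URL and store it
                iter after = url_end(b, e);
                ret.push_back(string(b, after));
                // continue looking after this URL
                b = after;
            }
        }
        return ret;
    }

    // true if c cannot be part of a URL
    bool not_url_char(char c)
    {
        // characters, besides letters and digits, that may appear in a URL
        static const string url_ch = "~;/?:@=&$-_.+!*(),";
        // cast first: the <cctype> functions are undefined for negative char values
        return !(std::isalnum(static_cast<unsigned char>(c)) ||
                 find(url_ch.begin(), url_ch.end(), c) != url_ch.end());
    }

    string::const_iterator url_end(string::const_iterator b, string::const_iterator e)
    {
        return find_if(b, e, not_url_char);
    }

    string::const_iterator url_beg(string::const_iterator b, string::const_iterator e)
    {
        static const string sep = "://";
        typedef string::const_iterator iter;

        // i marks where the separator was found
        iter i = b;
        while ((i = search(i, e, sep.begin(), sep.end())) != e) {
            // the separator must not be at the very beginning or end of the input
            if (i != b && i + sep.size() != e) {
                // beg marks the beginning of the protocol name
                iter beg = i;
                while (beg != b && std::isalpha(static_cast<unsigned char>(beg[-1])))
                    --beg;
                // need at least one letter before and one URL character after the separator
                if (beg != i && !not_url_char(i[sep.size()]))
                    return beg;
            }
            // this separator was not part of a URL; move past it
            i += sep.size();
        }
        return e;
    }

    int main()
    {
        // read the document from the file 1.htm instead of the keyboard
        if (!std::freopen("1.htm", "r", stdin)) {
            std::cerr << "cannot open 1.htm" << endl;
            return 1;
        }

        string content;
        string str;
        while (getline(cin, str)) {
            content += str;
            // getline drops the newline; put it back so that a URL at the
            // end of one line is not glued to the start of the next line
            content += '\n';
        }

        vector<string> urls = find_urls(content);
        for (vector<string>::const_iterator it = urls.begin(); it != urls.end(); ++it)
            cout << *it << endl;
        return 0;
    }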
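
To try the scanner without preparing a 1.htm file, it can be run on a string in memory. The following is a minimal sketch, not the book's listing: it restates the same three functions in condensed form. `url_end` is inlined as a `find_if` call, and the `unsigned char` casts are an addition. The helper `sample_urls` and its input text are hypothetical and exist only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

typedef std::string::const_iterator iter;

// True when c cannot appear inside a URL.
bool not_url_char(char c) {
    static const std::string url_ch = "~;/?:@=&$-_.+!*(),";
    // Cast first: <cctype> functions are undefined for negative char values.
    return !(std::isalnum(static_cast<unsigned char>(c)) ||
             std::find(url_ch.begin(), url_ch.end(), c) != url_ch.end());
}

// Start of the first "letters://x" pattern in [b, e), or e if there is none.
iter url_beg(iter b, iter e) {
    static const std::string sep = "://";
    iter i = b;
    while ((i = std::search(i, e, sep.begin(), sep.end())) != e) {
        // The separator must not sit at the very start or very end.
        if (i != b && i + sep.size() != e) {
            iter beg = i;
            while (beg != b && std::isalpha(static_cast<unsigned char>(beg[-1])))
                --beg;
            // Need a protocol name before and a URL character after.
            if (beg != i && !not_url_char(i[sep.size()]))
                return beg;
        }
        i += sep.size();  // not a URL; skip this separator
    }
    return e;
}

std::vector<std::string> find_urls(const std::string& s) {
    std::vector<std::string> ret;
    iter b = s.begin(), e = s.end();
    while (b != e) {
        b = url_beg(b, e);
        if (b != e) {
            // The URL ends at the first character that cannot be in a URL.
            iter after = std::find_if(b, e, not_url_char);
            ret.push_back(std::string(b, after));
            b = after;
        }
    }
    return ret;
}

// Hypothetical sample input (not from the original post).
std::vector<std::string> sample_urls() {
    return find_urls("see http://a.com/x?y=1 and ://bad, also <ftp://host/f>");
}
```

For this sample input the scanner finds `http://a.com/x?y=1` and `ftp://host/f`, and it skips `://bad` because no letters precede the separator. Note that `.` and `,` count as URL characters, so punctuation that directly follows a URL in running text is captured as part of it.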