Java Web Crawler

2016-04-12 14:56
import java.io.*;
import java.net.*;
import java.util.regex.*;

public class Calculator {

    // Fetches the page at the given URL and returns its body as a single string.
    static String SendGet(String url) {
        StringBuilder result = new StringBuilder();
        BufferedReader in = null;
        try {
            URL realUrl = new URL(url);
            URLConnection connection = realUrl.openConnection();
            connection.connect();
            // Decode the response body as UTF-8.
            in = new BufferedReader(new InputStreamReader(
                    connection.getInputStream(), "UTF-8"));
            String line;
            // Lines are joined without separators, so the whole page becomes one line.
            while ((line = in.readLine()) != null) {
                result.append(line);
            }
        } catch (Exception e) {
            System.out.println("exception!" + e);
            e.printStackTrace();
        } finally {
            // Always close the reader, even if the request failed.
            try {
                if (in != null) {
                    in.close();
                }
            } catch (Exception e2) {
                e2.printStackTrace();
            }
        }
        return result.toString();
    }

    // Returns capture group 1 of the first match, or "Nothing" if nothing matches.
    static String RegexString(String targetStr, String patternStr) {
        Pattern pattern = Pattern.compile(patternStr);
        Matcher matcher = pattern.matcher(targetStr);
        if (matcher.find()) {
            return matcher.group(1);
        }
        return "Nothing";
    }

    public static void main(String[] args) {
        String url = "http://www.zhihu.com/explore/recommendations";
        String result = SendGet(url);
        // Capture the text of the first element whose markup contains "question_link".
        String imgSrc = RegexString(result, "question_link.+?>(.+?)<");
        System.out.println(imgSrc);
    }
}
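
The program above prints only the first match, because RegexString calls matcher.find() once. A crawler usually wants every matching link on the page, so here is a minimal sketch of the same pattern applied in a loop. The class name LinkTextExtractor, the method extractAll, and the sample markup are made up for this illustration and are not part of the original post. The sketch also assumes the HTML contains no line breaks, which holds for the string that SendGet builds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkTextExtractor {

    // Same pattern as in the post: lazily skip to the '>' that closes the tag
    // containing "question_link", then capture the text up to the next '<'.
    private static final Pattern QUESTION_LINK =
            Pattern.compile("question_link.+?>(.+?)<");

    // Collects capture group 1 of every match, not just the first one.
    public static List<String> extractAll(String html) {
        List<String> titles = new ArrayList<>();
        Matcher matcher = QUESTION_LINK.matcher(html);
        while (matcher.find()) {
            titles.add(matcher.group(1));
        }
        return titles;
    }

    public static void main(String[] args) {
        // Hypothetical markup, invented for this example.
        String html =
                "<a class=\"question_link\" href=\"/question/1\">First question</a>"
                + "<a class=\"question_link\" href=\"/question/2\">Second question</a>";
        for (String title : extractAll(html)) {
            System.out.println(title);
        }
    }
}
```

Compiling the pattern once as a constant avoids recompiling it on every call. For real pages, an HTML parser is generally more robust than a regular expression, since attribute order and whitespace can vary.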

http://www.jb51.net/article/57197.htm
Tags: web crawler