
Fetching Web Page Content with Java

2011-03-16 15:26
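import java.io.BufferedReader;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.URL;
import java.net.URLConnection;

// The class name is chosen here for illustration; the original post showed only the two methods.
public class FetchPage {

    public static void main(String[] args) {
        try {
            URL url = new URL("http://www.google.com");
            // Read from the connection that was opened. A plain GET does not need
            // setDoOutput(true), and url.openStream() would open a second connection.
            URLConnection conn = url.openConnection();
            InputStream in = conn.getInputStream();
            String content = pipe(in, "utf-8");
            System.out.println(content);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Reads the stream line by line, saves the non-empty lines to ../index.html,
    // and returns them as a single string.
    static String pipe(InputStream in, String charset) throws IOException {
        StringBuilder s = new StringBuilder();
        if (charset == null || "".equals(charset)) {
            charset = "utf-8";
        }
        // try-with-resources closes the reader (and the underlying stream)
        // and the writer even if reading fails partway through.
        try (BufferedReader bReader = new BufferedReader(new InputStreamReader(in, charset));
             PrintWriter pw = new PrintWriter(
                     new OutputStreamWriter(new FileOutputStream("../index.html"), "utf-8"))) {
            String rLine;
            while ((rLine = bReader.readLine()) != null) {
                if (rLine.length() > 0) {
                    // Keep the line break so adjacent lines do not run together.
                    s.append(rLine).append('\n');
                    pw.println(rLine);
                }
            }
        }
        return s.toString();
    }
}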
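The code above always decodes the page as "utf-8", which garbles pages served in another encoding. As a minimal sketch, the charset could instead be taken from the response's Content-Type header. The class `CharsetSniffer` and the method `charsetFrom` are hypothetical names introduced here for illustration, not part of the original post, and the sketch assumes the server reports its encoding in that header (it does not inspect `<meta>` tags in the HTML).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetSniffer {

    // Hypothetical helper: extracts the charset from a Content-Type header value.
    // Falls back to UTF-8 when the header is missing, carries no charset
    // parameter, or names a charset this JVM does not support.
    static Charset charsetFrom(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                // Case-insensitive match on the "charset=" prefix.
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        break; // unknown or malformed name: use the fallback
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        System.out.println(charsetFrom("text/html; charset=ISO-8859-1"));
        System.out.println(charsetFrom("text/html"));
        System.out.println(charsetFrom(null));
    }
}
```

With a `URLConnection` named `conn`, the result could be passed to `pipe` as `pipe(conn.getInputStream(), charsetFrom(conn.getContentType()).name())`.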

Related technical posts: http://blog.sina.com.cn/gzwncb