Server returned HTTP response code: 403 for URL: http://blog.csdn.net
2014-11-01 21:38
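When scraping CSDN blog data with Jsoup, the request fails with an HTTP 403 error. This happens because the CSDN blog server restricts access.
If the server is rejecting crawlers, you can get around this by setting a User-Agent header so that the request looks like it comes from a browser: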
connection.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)");
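The line above assumes that `connection` is an already opened `java.net.URLConnection`. A minimal self-contained sketch of that setup might look like the following. The class name `UserAgentFetch` and the helper `open` are made up for illustration, and whether a given server accepts this particular User-Agent string is not guaranteed.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class UserAgentFetch {
    // User-Agent string taken from the post; other browser-like values may work as well.
    static final String USER_AGENT =
            "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)";

    // Hypothetical helper: prepares the connection and sets the header.
    // Nothing is sent over the network until the response is requested.
    static HttpURLConnection open(String url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setRequestProperty("User-Agent", USER_AGENT);
        return connection;
    }

    public static void main(String[] args) {
        try {
            HttpURLConnection connection = open("http://blog.csdn.net");
            // getResponseCode() triggers the actual request.
            System.out.println("HTTP status: " + connection.getResponseCode());
            connection.disconnect();
        } catch (IOException e) {
            System.out.println("Request failed: " + e);
        }
    }
}
```

The header has to be set before the response is read, because `setRequestProperty` throws an `IllegalStateException` once the connection is already open.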
Applying the same idea, the Jsoup code needs only a small adjustment:
Connection connection = Jsoup.connect(url);
connection.userAgent("Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)");
Document doc = connection.get();