
Using HttpClient to Bypass a Login CAPTCHA and Scrape Data

2015-11-25 14:40
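A session is kept alive through a cookie. If the user ticks "stay logged in for X days", that cookie keeps the session valid for X days. If the option is not ticked, the session generally lasts only for the current browser session, and the user has to log in again the next time the browser is opened.
This is why, in web security, the end goal of an XSS attack is usually to obtain the cookie and get into the system without logging in.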

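This post covers how, once you have a user's cookie, you can skip the login and use HttpClient to keep the original session alive and do things that normally require being logged in.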

The HttpClient 4.x library can handle cookies on its own.
There are two ways to add a cookie:
1. Through httpclient.setCookieStore(cookieStore)
2. Through addHeader(new BasicHeader("Cookie", cookie)) on the HttpGet or HttpPost

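Method 1:
Whether HttpClient sends the cookies it received from the server on the next request depends entirely on its configuration.

httpclient.getParams().setParameter(ClientPNames.COOKIE_POLICY, CookiePolicy.BEST_MATCH), or CookiePolicy.BROWSER_COMPATIBILITY
With the cookie policy set to BEST_MATCH or BROWSER_COMPATIBILITY, HttpClient includes the cookies returned by the server in its requests. If you also add a CookieStore manually as described above, the next request carries two cookie headers, Cookie and Cookie2.

If the cookie policy is left at its default (not set), you need to set the store manually with
httpclient.setCookieStore(cookieStore);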
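
To make concrete what "carrying the cookies returned by the server" involves, here is a minimal sketch in plain Java. It is not HttpClient's implementation: the class name `CookieEchoSketch` and the sample header values are hypothetical, and the sketch deliberately ignores the Path, Domain, and expiry attributes that a real cookie policy uses to decide whether a cookie should be sent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified illustration: Set-Cookie response headers -> Cookie request header. */
final class CookieEchoSketch {

    /**
     * Takes raw Set-Cookie header values and returns the value of the Cookie
     * header a client would send back. Attributes after the first ';' are
     * dropped; a real cookie policy evaluates them instead of discarding them.
     */
    static String toCookieHeader(List<String> setCookieValues) {
        List<String> pairs = new ArrayList<>();
        for (String value : setCookieValues) {
            int semicolon = value.indexOf(';');
            String pair = (semicolon >= 0 ? value.substring(0, semicolon) : value).trim();
            if (!pair.isEmpty()) {
                pairs.add(pair);
            }
        }
        return String.join("; ", pairs);
    }

    public static void main(String[] args) {
        // Hypothetical headers, for illustration only
        List<String> fromServer = Arrays.asList(
                "PHPSESSID=abc123; Path=/; HttpOnly",
                "lang=zh-CN; Domain=example.com");
        System.out.println("Cookie: " + toCookieHeader(fromServer));
    }
}
```

A cookie store automates this bookkeeping per domain and path, which is why method 1 depends on the cookie policy being configured.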

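Method 2:
Setting the cookie through a header is the approach used in today's scenario:
we obtain the cookie of a logged-in session and access the site without logging in.
Log in with a browser, press F12, and run document.cookie in the console to get the cookie (cookies flagged HttpOnly are not visible to document.cookie).
Then pass that cookie with addHeader(new BasicHeader("Cookie", cookie)) on the HttpGet or HttpPost, and the request goes through without a login.
I tried method 1 for this scenario and could not get it to work, possibly because I did not set the path, domain, and expiry on the cookies; that is only my guess.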
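
The string returned by document.cookie is a list of name=value pairs separated by semicolons. Below is a minimal sketch, in plain Java, of splitting such a string into individual cookies. The class name `CookieStringParser` is hypothetical, and the `token=a=b` sample is made up to show that only the first `=` separates name from value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Splits a document.cookie style string into individual name/value pairs. */
final class CookieStringParser {

    /** Parses "a=1; b=2" into an ordered map; a later duplicate name overwrites an earlier one. */
    static Map<String, String> parse(String cookieString) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String part : cookieString.split(";")) {
            String pair = part.trim();
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty segments and pairs without a name
            }
            // Only the first '=' separates the name from the value
            cookies.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return cookies;
    }

    public static void main(String[] args) {
        Map<String, String> cookies = parse("DJ_RF=empty; uchome_loginuser=23860580; token=a=b");
        for (Map.Entry<String, String> e : cookies.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

If the guess about method 1 is right, each parsed pair could be turned into a BasicClientCookie with setDomain and setPath called on it before it is added to the cookie store. This has not been verified against the sites used in this post.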

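This approach works for sites that require a CAPTCHA even on the first login, where nothing can be posted or refreshed without being logged in.
Ganji.com (赶集网) is one example.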

We access the site directly with a GET request, bypassing the CAPTCHA:

package com.artsoft.demo;

import java.util.Date;

import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.message.BasicHeader;
import org.apache.http.util.EntityUtils;

import com.artsoft.util.DownloadUtil;
import com.artsoft.util.HtmlAnalyze;

public class Dajiewang {

    public static void daJeWang() throws Exception {
        DefaultHttpClient client = new DefaultHttpClient();
        System.out.println("******************************Page redirect******************************");
        String newUrl = "http://www.dajie.com/home";
        HttpGet get = new HttpGet(newUrl);
        // Cookie string copied from a logged-in browser session (document.cookie)
        get.addHeader(new BasicHeader("Cookie",
"DJ_UVID=MTQ0Njc3NTY0MTU1Mzc3MDc2; DJ_RF=empty; DJ_EU=http%3A%2F%2Fwww.dajie.com%2Fhome; login_email=764295333%40qq.com; dj_auth_v3=MW_qOtlnwl_JWoggzLsiIygjegD07-zT0hRU1DpC7Nwrsyf3qxtw-s9uPFHeds4*; uchome_loginuser=23860580; dj_cap=623eefeadd1d35d8d524c3a4c11e428f; USER_ACTION=request^AProfessional^ANORMAL^A-^A-; login_email=764295333%40qq.comHost:www.dajie.com"));
        get.addHeader("Content-Type", "text/html;charset=UTF-8");
        get.addHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:26.0) Gecko/20100101 Firefox/26.0");
        get.addHeader("Host", "www.dajie.com");
        get.addHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        get.addHeader("Accept-Language", "zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3");
        HttpResponse httpResponse = client.execute(get);
        String responseString = EntityUtils.toString(httpResponse.getEntity());
        // Content of the home page as seen after login
        System.out.println(responseString);
        get.releaseConnection();
    }

    public static void Weibo(String newUrl) throws Exception {
        DefaultHttpClient client = new DefaultHttpClient();
        System.out.println("******************************Page redirect******************************");
        // String newUrl = "http://data.weibo.com/index/ajax/getchartdata?month=default&__rnd=1091324464527";
        HttpGet get = new HttpGet(newUrl);
        // get.addHeader("Content-Type", "text/html;charset=UTF-8");
        // get.addHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:26.0) Gecko/20100101 Firefox/26.0");
        // get.addHeader("Host", "data.weibo.com");
        // get.addHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        get.addHeader("Accept-Encoding", "gzip, deflate, sdch");
        get.addHeader("Accept-Language", "zh-CN,zh;q=0.8");
        get.addHeader("Connection", "keep-alive");
        get.addHeader("Content-Type", "application/x-www-form-urlencoded");
        // Cookie string copied from a logged-in browser session
        get.addHeader(new BasicHeader("Cookie",
"SINAGLOBAL=8549726845230.907.1445398578667; SUHB=0sqQ0pK3WBV2gN; DATA=usrmdinst_5; _s_tentry=-; Apache=7532238222192.973.1448331434936; ULV=1448331434952:13:8:1:7532238222192.973.1448331434936:1447378860051; SUB=_2AkMhD0eLdcNhrAFZmP0SzG3rbolXzQ7wu9_0M03fZ2JCMnoQgT5nqiRotBF_DN7Dt0e6al7NzPhNs71jebD5Fh4XHuaWFWw.; SUBP=0033WrSXqPxfM72wWs9jqgMF55529P9D9WFVId20mkyG_N-5ejfVKF0s5JpV2hMcShz4SKe0eXWpMC4odcXt; login_sid_t=19644dacc1b9296d1e5bcfad125de02c; WBStore=062485857e03170e|undefined; PHPSESSID=ffiim2vvu63quisbpkga00pap3; UOR=picture.youth.cn,widget.weibo.com,static.xiaomi.cn"));

        get.addHeader("Host", "data.weibo.com");
        get.addHeader("Referer", "http://data.weibo.com/index/hotword");
        get.addHeader("User-Agent",
                "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36");
        get.addHeader("X-Requested-With", "XMLHttpRequest");
        // get.addHeader("Accept-Language", "zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3");

        HttpResponse httpResponse = client.execute(get);
        String responseString = EntityUtils.toString(httpResponse.getEntity());
        // Content of the page as seen after login
        System.out.println(responseString);
        get.releaseConnection();
    }

    public static void main(String[] args) throws Exception {
        // DownloadUtil and HtmlAnalyze are the author's own helpers (com.artsoft.util), not shown here
        String strHtml = DownloadUtil.getHtmlText("http://data.weibo.com/index/hotword?wid=1091324464527&wname=范冰冰",
                1000 * 30, "UTF-8", null, null);
        String timeDiff = HtmlAnalyze.getTagText(strHtml, "server_time': '", "'");
        System.out.println(new Date());
        System.out.println(timeDiff);

        Date date = new Date(System.currentTimeMillis());
        // Note: the (int) cast narrows the long millisecond difference, so s can overflow and wrap around
        int s = (int) (date.getTime() - Integer.parseInt(timeDiff));
        System.out.println(s);

        // System.out.println(Integer.parseInt(timeDiff));
        // System.out.println(new Date()- new Date(Integer.parseInt(timeDiff));
        Weibo("http://data.weibo.com/index/ajax/getchartdata?month=default&__rnd=" + s);

        // System.out.println(new SimpleDateFormat("yyyy-MM-dd hh:mm:ss").format(new Date(1446912627104l)));
    }

}
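
The main method above derives the __rnd value by subtracting the parsed server_time from the current time in milliseconds and casting the result to int. The sketch below does the same subtraction but keeps the result as a long, so it can be compared with the narrowed value. The class name `RndValueSketch` and both sample values are hypothetical, and it is an assumption that the server accepts any numeric __rnd.

```java
/** Compares the long time difference with its int-narrowed form. */
final class RndValueSketch {

    /** Same subtraction as in main(), but parsed and returned as long. */
    static long diff(long nowMillis, String serverTime) {
        return nowMillis - Long.parseLong(serverTime.trim());
    }

    public static void main(String[] args) {
        long now = 1448331434952L;        // hypothetical current time in milliseconds
        String serverTime = "1448331434"; // hypothetical server_time value
        long wide = diff(now, serverTime);
        int narrow = (int) wide;          // what the (int) cast in main() produces
        System.out.println("long difference: " + wide);
        System.out.println("after (int) cast: " + narrow);
    }
}
```

Using Long.parseLong also avoids the NumberFormatException that Integer.parseInt throws when the parsed value does not fit in an int.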