
Scraping data behind a login

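2015-11-29 23:01
This came from a client request: scrape data from another website, including submitting data to it. All of these operations have to happen after logging in. Technically there is nothing difficult here; the key is to use Fiddler to find the parameters and URL of each request.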

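Remembering the login state

HttpClient can remember the login state, so once the login has completed, the HttpClient instance can be saved for reuse (here, in Session).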

private HttpClient _client;

// Lazily creates the client, logs in once, and caches the client in Session
// so that later requests reuse the same cookies.
public HttpClient HttpClient
{
    get
    {
        if (_client == null)
        {
            if (Session["Client"] != null)
            {
                _client = Session["Client"] as HttpClient;
            }
            else
            {
                var handler = new HttpClientHandler
                {
                    AutomaticDecompression = DecompressionMethods.GZip,
                    UseCookies = true, // keep cookies between requests
                    // Proxy (placeholder address and credentials)
                    Proxy = new WebProxy("http://ip:8080/", true, null,
                        new NetworkCredential("username", "pwd", "domain"))
                };
                _client = new HttpClient(handler);
                _client.DefaultRequestHeaders.Add("user-agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36");
                ClientLogin(new ClientLogoModel());
                Session["Client"] = _client;
            }
        }

        return _client;
    }
}
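
The login state survives between requests because, with UseCookies enabled, the handler's CookieContainer stores the cookies returned by the server and sends them back on later requests. As a minimal sketch (the class and method names here are hypothetical and not part of the original project), passing in an explicit CookieContainer makes it possible to check which cookies the login actually produced:

```csharp
using System;
using System.Net;
using System.Net.Http;

public static class CookieSketch
{
    // Hypothetical helper: builds a client whose cookies live in the
    // container supplied by the caller.
    public static HttpClient CreateClient(CookieContainer cookies)
    {
        var handler = new HttpClientHandler
        {
            UseCookies = true,
            CookieContainer = cookies
        };
        return new HttpClient(handler);
    }

    // Hypothetical helper: prints the cookies that would be sent to the given site.
    public static void DumpCookies(CookieContainer cookies, Uri site)
    {
        foreach (Cookie cookie in cookies.GetCookies(site))
        {
            Console.WriteLine("{0} = {1}", cookie.Name, cookie.Value);
        }
    }
}
```

Calling DumpCookies with the target site's base address right after the login request is a quick way to confirm that a session cookie was set.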


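The target site both accepts and returns its parameters as JSON rather than as ordinary form fields, so the parameters have to be serialized to JSON before posting. In the code below, the JSON string is sent as the value of a single form field named data.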

public object ClientLogin(ClientLogoModel logoModel)
{
    if (logoModel == null)
    {
        logoModel = new ClientLogoModel();
    }

    // Serialize the login model to JSON and post it as the "data" form field.
    var data = JsonConvert.SerializeObject(logoModel);
    var logoParams = new List<KeyValuePair<string, string>>();
    logoParams.Add(new KeyValuePair<string, string>("data", data));
    var response = _client.PostAsync(new Uri(LogonUrl), new FormUrlEncodedContent(logoParams)).Result;
    var result = response.Content.ReadAsStringAsync().Result;
    return result;
}
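
ClientLogin above wraps the JSON in a form field named data, which is what this particular site expected. If an endpoint instead expects the JSON as the raw request body, one possible sketch uses StringContent with an application/json content type. The class name JsonPostSketch and the method PostJson are hypothetical, and the sketch assumes Json.NET is referenced, as in the rest of the post:

```csharp
using System;
using System.Net.Http;
using System.Text;
using Newtonsoft.Json;

public static class JsonPostSketch
{
    // Hypothetical helper: serializes the payload and sends it as the request
    // body with Content-Type: application/json; charset=utf-8.
    public static string PostJson(HttpClient client, string url, object payload)
    {
        var json = JsonConvert.SerializeObject(payload);
        using (var content = new StringContent(json, Encoding.UTF8, "application/json"))
        {
            var response = client.PostAsync(new Uri(url), content).Result;
            response.EnsureSuccessStatusCode(); // throw on non-2xx responses
            return response.Content.ReadAsStringAsync().Result;
        }
    }
}
```

Which of the two forms a site needs is visible in Fiddler: compare the Content-Type header and the body of the captured request.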


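Converting the returned data

Get the URL from the left pane in Fiddler. In the right pane, the upper TextView shows the format of the request parameters and the lower one shows the format of the returned data.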



This conversion is needed for every request, so it is written as a generic method.

public T GetTList<T>(object obj, string url)
{
    // Serialize the query object to JSON and post it as the "data" form field.
    var data = JsonConvert.SerializeObject(obj);
    var paramList = new List<KeyValuePair<string, string>> { new KeyValuePair<string, string>("data", data) };
    var response = HttpClient.PostAsync(new Uri(url), new FormUrlEncodedContent(paramList)).Result;

    // Deserialize the JSON response into the requested type.
    var result = response.Content.ReadAsStringAsync().Result;
    return JsonConvert.DeserializeObject<T>(result);
}
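
The generic method above blocks on .Result, which keeps the code simple but ties up the request thread and can deadlock when a synchronization context is present, as in classic ASP.NET. As a hedged alternative, here is an async sketch of the same idea. GetTListAsync and the class name are hypothetical, and the client is passed in explicitly so the sketch stays self-contained:

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json;

public static class AsyncFetchSketch
{
    // Hypothetical async counterpart of GetTList: posts the serialized object
    // as the "data" form field and deserializes the JSON response into T.
    public static async Task<T> GetTListAsync<T>(HttpClient client, object obj, string url)
    {
        var data = JsonConvert.SerializeObject(obj);
        var form = new FormUrlEncodedContent(new[]
        {
            new KeyValuePair<string, string>("data", data)
        });

        using (var response = await client.PostAsync(new Uri(url), form).ConfigureAwait(false))
        {
            response.EnsureSuccessStatusCode(); // throw on non-2xx responses
            var body = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
            return JsonConvert.DeserializeObject<T>(body);
        }
    }
}
```

An MVC action using this would itself be declared async and return Task<ActionResult>.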


Usage:

public ActionResult TradePage(TradeQueryParm param)
{
    // Pass the bound query parameters on to the target site.
    var data = GetTList<TradeRequstResult>(param, tradeListUrl);
    return PartialView(data);
}


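The front end then posts the parameters to this action.

$.post("/Trade/TradePage", {
    agentName: agentName, shortName: shortName,
    startDate: startDate, endDate: endDate, page: cpage
}, function (data) {
    // Replace the table contents with the returned partial view.
    $("#mtable").html(data);
});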


Uploading an image with HttpClient:


private string UploadImage(string fileName, string path)
{
    // Open the file read-only and dispose the stream and form once the upload finishes.
    using (var aFile = new FileStream(path, FileMode.Open, FileAccess.Read))
    using (var form = new MultipartFormDataContent())
    {
        var content = new StreamContent(aFile);
        content.Headers.ContentType = new MediaTypeHeaderValue("image/jpeg");
        // "protocolFile" is the form field name the target site expects.
        content.Headers.ContentDisposition = new ContentDispositionHeaderValue("form-data")
        {
            Name = "protocolFile",
            FileName = fileName
        };
        form.Add(content);
        var response = HttpClient.PostAsync(imgLoadUrl, form).Result;
        return response.Content.ReadAsStringAsync().Result;
    }
}

