
Simulating Web Page Behavior: Practice, Part 3

2016-11-21 17:55
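Now let's look at how to obtain the CAPTCHA image. Any page that shows a CAPTCHA also has a refresh button next to it, and the code behind that button contains the URL the CAPTCHA is fetched from. If you have read the earlier post "Simulating Web Page Behavior: Tools" (《模拟网页行为之工具篇》), locating that code is easy. Once you have found it, see the figure below: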



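The URL that serves the CAPTCHA turns out to be https://ipin.siren24.com/stickyCaptcha. That alone is not enough, though. As the previous post explained when it introduced cookies, the CAPTCHA is only valid if it is refreshed with the cookie attached. To see how to obtain that cookie, look at the figure below: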



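As the capture shows, the cookie is host-only: it is sent only to the exact host that set it, so it has to be obtained for this very URL. With the packet-capture analysis above, we now have a basic picture of how the CAPTCHA flow works.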
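To make the host-only distinction concrete, here is a minimal sketch of the matching rule. The helper name `CookieAppliesToHost` is hypothetical and not part of WinINet or libcurl, and the sketch assumes host names are already lower-cased.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: decides whether a cookie stored for cookieDomain
// would be sent to requestHost.
//   hostOnly == true  -> the two host names must be identical.
//   hostOnly == false -> requestHost may also be a subdomain of cookieDomain.
// Assumes both names are already lower-cased.
bool CookieAppliesToHost(const std::string& cookieDomain, bool hostOnly,
                         const std::string& requestHost)
{
    assert(!cookieDomain.empty());
    if (requestHost == cookieDomain)
        return true;
    if (hostOnly)
        return false;
    // Domain cookie: requestHost must end with "." + cookieDomain.
    const std::string suffix = "." + cookieDomain;
    return requestHost.size() > suffix.size() &&
           requestHost.compare(requestHost.size() - suffix.size(),
                               suffix.size(), suffix) == 0;
}
```

Under this rule, a host-only cookie set by ipin.siren24.com is not sent to any sibling or child host, which is why the cookie has to be read for the CAPTCHA URL itself.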

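The remaining work therefore splits into two steps:

1. Obtain the cookie for https://ipin.siren24.com/stickyCaptcha.

2. Refresh the CAPTCHA with that cookie attached to get the image data.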

The C++ code for reading the host-only cookie is as follows:

// Requires <vector> in addition to the WinINet headers.
std::string CWebLoginDlg::GetCookie( std::string url )
{
    // Start with a 1-byte buffer; WinINet reports the size it needs in dwSize.
    DWORD dwSize = 1;
    std::vector<char> buf(dwSize, 0);

    // The cookie-name parameter of InternetGetCookie is not implemented, so pass NULL.
    while (!InternetGetCookieA(url.c_str(), NULL, &buf[0], &dwSize))
    {
        if (GetLastError() != ERROR_INSUFFICIENT_BUFFER)
        {
            ATLTRACE("cookie is null");
            return std::string();
        }
        // Grow to the reported size plus a terminator, then try again.
        buf.assign(dwSize + 1, 0);
        dwSize = (DWORD)buf.size();
    }
    // Build the result from the C string so no trailing NUL ends up in it.
    return std::string(&buf[0]);
}
The argument to pass is https://ipin.siren24.com/stickyCaptcha.
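InternetGetCookie returns every cookie for the URL in one string of the form `name=value; name=value`. If only one of them is needed, the string can be split first. The following is a minimal sketch; `ParseCookieString` is a hypothetical helper, and it assumes pairs are separated by `;` with optional spaces after the separator.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: splits "a=1; b=2" into {a -> 1, b -> 2}.
// Pairs without '=' or with an empty name are skipped. Only leading
// spaces of each pair are trimmed.
std::map<std::string, std::string> ParseCookieString(const std::string& cookies)
{
    std::map<std::string, std::string> result;
    std::string::size_type pos = 0;
    while (pos < cookies.size())
    {
        std::string::size_type end = cookies.find(';', pos);
        if (end == std::string::npos)
            end = cookies.size();
        assert(end >= pos);

        const std::string::size_type start = cookies.find_first_not_of(' ', pos);
        if (start != std::string::npos && start < end)
        {
            const std::string pair = cookies.substr(start, end - start);
            const std::string::size_type eq = pair.find('=');
            if (eq != std::string::npos && eq > 0)
                result[pair.substr(0, eq)] = pair.substr(eq + 1);
        }
        pos = end + 1;
    }
    return result;
}
```

With the string returned by GetCookie, `ParseCookieString(cookie)["JSESSIONID"]` would yield just the session id, provided the server uses that cookie name.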

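If the cookie is HttpOnly, it has to be read differently: InternetGetCookieEx must be called with the INTERNET_COOKIE_HTTPONLY flag. The C++ code is as follows: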

// Requires <vector> in addition to the WinINet headers.
std::wstring CWebLoginDlg::GetCookieEx( std::wstring url )
{
    DWORD dwSize = 1;
    std::vector<wchar_t> buf(dwSize, L'\0');

    // INTERNET_COOKIE_HTTPONLY (0x00002000) lets WinINet return HttpOnly cookies.
    while (!InternetGetCookieExW(url.c_str(), L"JSESSIONID", &buf[0], &dwSize,
                                 INTERNET_COOKIE_HTTPONLY, NULL))
    {
        if (GetLastError() != ERROR_INSUFFICIENT_BUFFER)
        {
            ATLTRACE("cookie is null");
            return std::wstring();
        }
        // dwSize now holds the size WinINet asks for. Allocating that many
        // wchar_t elements (plus a terminator) over-allocates at worst.
        buf.assign(dwSize + 1, L'\0');
        dwSize = (DWORD)buf.size();
    }
    // Build the result from the C string so no trailing NUL ends up in it.
    return std::wstring(&buf[0]);
}


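To refresh the CAPTCHA image data I use the curl library. In fact, anything a web page does over the network can be reproduced with curl, but here it is used only to fetch the CAPTCHA image. Once the cookie has been obtained as above, format it as shown below and pass it as the cookie argument to the function that fetches the response data. In C++: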

std::string cookie = GetCookie("https://ipin.siren24.com/stickyCaptcha");
// Append the attributes; a std::string avoids overflowing a fixed-size buffer.
std::string cookieLine = cookie + "; domain=ipin.siren24.com; path=/; hostOnly";
std::string ret; // receives the raw image bytes
m_pCurlClient->GetURLResource("https://ipin.siren24.com/stickyCaptcha", cookieLine, ret);
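As an alternative to a Set-Cookie style line, libcurl's CURLOPT_COOKIELIST also accepts lines in the Netscape cookie-file format: seven tab-separated fields (domain, include-subdomains flag, path, secure flag, expiry, name, value). Setting the second field to FALSE is how that format expresses a host-only cookie. The sketch below is not the author's approach; `ToNetscapeCookieLine` is a hypothetical helper that assumes a single `name=value` pair and uses an expiry of 0, which libcurl treats as a session cookie.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: turns one "name=value" pair into a Netscape
// cookie-file line:
//   domain \t include-subdomains \t path \t secure \t expiry \t name \t value
// include-subdomains is fixed to FALSE (host-only) and path to "/".
std::string ToNetscapeCookieLine(const std::string& nameValue,
                                 const std::string& host, bool secure)
{
    const std::string::size_type eq = nameValue.find('=');
    // The caller must pass exactly one name=value pair with a non-empty name.
    assert(eq != std::string::npos && eq > 0);

    const std::string name = nameValue.substr(0, eq);
    const std::string value = nameValue.substr(eq + 1);
    return host + "\tFALSE\t/\t" + (secure ? "TRUE" : "FALSE") +
           "\t0\t" + name + "\t" + value;
}
```

The resulting string would be passed directly to `curl_easy_setopt(curl, CURLOPT_COOKIELIST, line.c_str())`, without a "Set-Cookie: " prefix.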


GetURLResource is implemented as follows:

struct MemoryStruct {
    char *memory;
    size_t size;
};

// libcurl write callback: appends each received chunk to the MemoryStruct in userp.
size_t CurlClient::WriteMemoryCallback(void *contents, size_t size, size_t nmemb, void *userp)
{
    size_t realsize = size * nmemb;
    if (userp == NULL)
    {
        return realsize;
    }
    struct MemoryStruct *mem = (struct MemoryStruct *)userp;

    // Use a temporary so the old block is not leaked if realloc fails.
    char *grown = (char*)realloc(mem->memory, mem->size + realsize + 1);
    if (grown == NULL) {
        /* out of memory! */
        printf("not enough memory (realloc returned NULL)\n");
        return 0; // returning less than realsize makes libcurl abort the transfer
    }
    mem->memory = grown;

    memcpy(&(mem->memory[mem->size]), contents, realsize);
    mem->size += realsize;
    mem->memory[mem->size] = 0;

    return realsize;
}
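The same callback can be written without manual realloc bookkeeping by collecting the data in a std::string. This is a sketch of that variation rather than the code used in this post; `AppendToString` is a hypothetical free function, and it assumes the pointer given to CURLOPT_WRITEDATA is a `std::string*`.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical callback with the signature libcurl expects for
// CURLOPT_WRITEFUNCTION. userp is assumed to point at a std::string.
size_t AppendToString(void* contents, size_t size, size_t nmemb, void* userp)
{
    const size_t realsize = size * nmemb;
    if (userp == NULL)
        return realsize; // nothing to store into; pretend the data was consumed
    assert(contents != NULL || realsize == 0);

    std::string* out = static_cast<std::string*>(userp);
    try
    {
        // append() with an explicit length keeps embedded NUL bytes,
        // which matters for binary image data.
        out->append(static_cast<const char*>(contents), realsize);
    }
    catch (const std::bad_alloc&)
    {
        return 0; // a short return value makes libcurl abort the transfer
    }
    return realsize;
}
```

It would be registered with `curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, AppendToString)` and `curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response)`, where `response` is a std::string owned by the caller.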

bool CurlClient::GetURLResource( std::string url, std::string cookie, std::string &rev)
{
    // Treat the URL as HTTPS only when it actually starts with the scheme.
    bool ssl = (url.compare(0, 8, "https://") == 0);
    struct MemoryStruct chunk;
    chunk.memory = (char*)malloc(1);
    chunk.size = 0;

    // Initialised so the return value is defined even if curl_easy_init fails.
    CURLcode res = CURLE_FAILED_INIT;

    CURL *curl = curl_easy_init();
    if (curl)
    {
        if (!cookie.empty())
        {
            // A std::string avoids overflowing a fixed-size buffer on long cookies.
            std::string nline = "Set-Cookie: " + cookie;
            curl_easy_setopt(curl, CURLOPT_COOKIELIST, nline.c_str());
        }

        curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
        int agentIndex = m_Multi ? GetCurrentThreadId() % m_UserAgentList.size() : 0;
        curl_easy_setopt(curl, CURLOPT_USERAGENT, m_UserAgentList[agentIndex].c_str());
        if (ssl)
        {
            // Certificate checks are switched off here, which is only acceptable for experiments.
            curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0L);
            curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 0L);
        }
        // WriteMemoryCallback has to be a static member to be usable as a C callback.
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, &CurlClient::WriteMemoryCallback);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, (void*)&chunk);
        curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);
        curl_easy_setopt(curl, CURLOPT_NOSIGNAL, 1L);
        curl_easy_setopt(curl, CURLOPT_FORBID_REUSE, 1L); // with multiple threads, drop the connection as soon as the transfer is done
        curl_easy_setopt(curl, CURLOPT_TIMEOUT, 30L);
        curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, 15L);
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
        res = curl_easy_perform(curl);
        if (res != CURLE_OK)
        {
            char curlerror[1024 * 5] = {0};
            sprintf_s(curlerror, _countof(curlerror), "Returned message: %s", curl_easy_strerror(res));
            m_Error = curlerror;
        }
        if (chunk.memory != NULL)
        {
            rev = std::string(chunk.memory, chunk.size);
        }
        curl_easy_cleanup(curl);
    }
    // Freed outside the if so the buffer is released even when curl_easy_init fails.
    free(chunk.memory);
    return res == CURLE_OK;
}


With that, the CAPTCHA image data can be retrieved.
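Before saving or decoding the returned bytes, it can help to confirm that the response really is an image and not, say, an HTML error page. The sketch below checks the well-known magic bytes of PNG, JPEG and GIF. `SniffImageType` is a hypothetical helper, and which format this particular server returns is not stated here, so all three are checked.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: guesses the image format from the leading magic bytes.
// Returns "png", "jpeg", "gif", or "" when none of them match.
std::string SniffImageType(const std::string& data)
{
    static const char kPng[]  = "\x89PNG\r\n\x1a\n"; // 8-byte PNG signature
    static const char kJpeg[] = "\xFF\xD8\xFF";       // JPEG start-of-image marker
    assert(sizeof(kPng) - 1 == 8 && sizeof(kJpeg) - 1 == 3);

    if (data.size() >= 8 && data.compare(0, 8, kPng, 8) == 0)
        return "png";
    if (data.size() >= 3 && data.compare(0, 3, kJpeg, 3) == 0)
        return "jpeg";
    if (data.size() >= 6 &&
        (data.compare(0, 6, "GIF87a") == 0 || data.compare(0, 6, "GIF89a") == 0))
        return "gif";
    return "";
}
```

Calling `SniffImageType(ret)` on the string filled by GetURLResource and getting an empty result would suggest the cookie was rejected or the request failed, even if curl itself reported success.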