
Garbled Chinese text when downloading web page source with curl

2017-02-08 10:22

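I started from this post: http://blog.csdn.net/malihong1/article/details/50480420. Its author apparently did not go on to find a solution.

The conversion function is borrowed from http://www.cnblogs.com/iRoad/p/4105172.html.

I modified https.c from the official examples (https://curl.haxx.se/libcurl/c/example.html) directly. The response is collected into a std::string and then converted from UTF-8 to the system ANSI code page (GBK on Chinese Windows). The code is as follows: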

/***************************************************************************
*                                  _   _ ____  _
*  Project                     ___| | | |  _ \| |
*                             / __| | | | |_) | |
*                            | (__| |_| |  _ <| |___
*                             \___|\___/|_| \_\_____|
*
* Copyright (C) 1998 - 2015, Daniel Stenberg, <daniel@haxx.se>, et al.
*
* This software is licensed as described in the file COPYING, which
* you should have received as part of this distribution. The terms
* are also available at https://curl.haxx.se/docs/copyright.html.
*
* You may opt to use, copy, modify, merge, publish, distribute and/or sell
* copies of the Software, and permit persons to whom the Software is
* furnished to do so, under the terms of the COPYING file.
*
* This software is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY
* KIND, either express or implied.
*
***************************************************************************/
/* <DESC>
* Simple HTTPS GET
* </DESC>
*/
// ATL, Windows, and standard library headers
#include <atlstr.h>
#include <WinINet.h>
#include <string>

#include <stdio.h>
#include <curl/curl.h>
using namespace std;

// Convert a UTF-8 string to the system ANSI code page (GBK on Chinese Windows).
string UTF8ToGBK(const std::string& strUTF8)
{
  // UTF-8 -> UTF-16. With a source length of -1, the returned count includes the terminating NUL.
  int len = MultiByteToWideChar(CP_UTF8, 0, strUTF8.c_str(), -1, NULL, 0);
  WCHAR* wszGBK = new WCHAR[len + 1];
  memset(wszGBK, 0, (len + 1) * sizeof(WCHAR));
  MultiByteToWideChar(CP_UTF8, 0, strUTF8.c_str(), -1, wszGBK, len);

  // UTF-16 -> ANSI code page (CP_ACP).
  len = WideCharToMultiByte(CP_ACP, 0, wszGBK, -1, NULL, 0, NULL, NULL);
  char* szGBK = new char[len + 1];
  memset(szGBK, 0, len + 1);
  WideCharToMultiByte(CP_ACP, 0, wszGBK, -1, szGBK, len, NULL, NULL);
  std::string strTemp(szGBK);
  delete[] szGBK;
  delete[] wszGBK;
  return strTemp;
}

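/**
@brief  Write callback for received char* data; only valid when stream actually points to a std::string
*/
static size_t DataReceiveCallback(void *ptr, size_t size, size_t nmemb, void *stream) {
  size_t len = size * nmemb;

  // Append the raw bytes of this chunk; conversion happens later on the whole buffer.
  std::string* buffer = static_cast<std::string*>(stream);
  if (buffer)
    buffer->append(static_cast<char*>(ptr), len);

  return len;
}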

int main(void)
{
  CURL *curl;
  CURLcode res;

  std::string data;
  curl_global_init(CURL_GLOBAL_DEFAULT);

  curl = curl_easy_init();
  if(curl) {
    curl_easy_setopt(curl, CURLOPT_URL, "https://v.qq.com/x/cover/jwplwx9ootoigud.html");

#ifdef SKIP_PEER_VERIFICATION
    /*
     * If you want to connect to a site who isn't using a certificate that is
     * signed by one of the certs in the CA bundle you have, you can skip the
     * verification of the server's certificate. This makes the connection
     * A LOT LESS SECURE.
     *
     * If you have a CA cert for the server stored someplace else than in the
     * default bundle, then the CURLOPT_CAPATH option might come handy for
     * you.
     */
    curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0L);
#endif

#ifdef SKIP_HOSTNAME_VERIFICATION
    /*
     * If the site you're connecting to uses a different host name than what
     * they have mentioned in their server certificate's commonName (or
     * subjectAltName) fields, libcurl will refuse to connect. You can skip
     * this check, but this will make the connection less secure.
     */
    curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 0L);
#endif

    // Response data callback: collect the body into data
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, DataReceiveCallback);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &data);

    /* Perform the request, res will get the return code */
    res = curl_easy_perform(curl);
    /* Check for errors */
    if(res != CURLE_OK) {
      fprintf(stderr, "curl_easy_perform() failed: %s\n",
              curl_easy_strerror(res));
    }
    else {
      // Convert the complete UTF-8 response to GBK; strGBK can then be displayed or saved.
      string strGBK = UTF8ToGBK(data);
    }

    /* always cleanup */
    curl_easy_cleanup(curl);
  }

  curl_global_cleanup();

  return 0;
}
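
A note on why the listing converts the whole buffer once, after curl_easy_perform returns, instead of converting inside the callback: libcurl may call the write callback many times for one response, and a chunk boundary can fall in the middle of a multi-byte UTF-8 character. The sketch below calls a callback with the same signature by hand to show that appending raw bytes keeps a split character intact. FeedInTwoChunks is a hypothetical helper made up for this illustration and is not part of libcurl.

```cpp
#include <cstddef>
#include <string>

// Same signature and logic as the write callback in the listing: each call
// receives one chunk in ptr and the user pointer (a std::string*) in stream.
static size_t DataReceiveCallback(void *ptr, size_t size, size_t nmemb, void *stream) {
    size_t len = size * nmemb;
    std::string *buffer = static_cast<std::string *>(stream);
    if (buffer)
        buffer->append(static_cast<char *>(ptr), len);
    return len;
}

// Hypothetical helper (illustration only): simulates a body arriving in two
// chunks, split at byte offset `split`, and returns the accumulated bytes.
static std::string FeedInTwoChunks(const std::string &body, size_t split) {
    if (split > body.size())
        split = body.size();  // clamp so substr() cannot throw

    std::string first = body.substr(0, split);
    std::string second = body.substr(split);

    std::string data;
    DataReceiveCallback(&first[0], 1, first.size(), &data);
    DataReceiveCallback(&second[0], 1, second.size(), &data);
    return data;
}
```

For example, the character 中 is the three bytes E4 B8 AD in UTF-8. Splitting "<p>中</p>" at offset 4 puts E4 in the first chunk and B8 AD in the second. Neither chunk is valid UTF-8 on its own, but the accumulated buffer is, so it can safely be passed to UTF8ToGBK afterwards.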

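The listing assumes the page is served as UTF-8. If a site sends GBK or GB2312 directly, running it through UTF8ToGBK would garble it instead. A possible safeguard is to look at the charset in the Content-Type header first (libcurl exposes that header through curl_easy_getinfo with CURLINFO_CONTENT_TYPE). The following is only a minimal sketch of the string handling; ExtractCharset and NeedsUtf8ToGbk are hypothetical names, and a page that declares its charset only in a meta tag would need separate handling.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the lower-cased charset named in a
// Content-Type value such as "text/html; charset=UTF-8", or an empty
// string when no charset parameter is present.
static std::string ExtractCharset(const std::string &contentType) {
    // Lower-case a copy so the search is case-insensitive.
    std::string lower;
    for (size_t i = 0; i < contentType.size(); ++i)
        lower += static_cast<char>(
            std::tolower(static_cast<unsigned char>(contentType[i])));

    const std::string key = "charset=";
    size_t pos = lower.find(key);
    if (pos == std::string::npos)
        return "";
    pos += key.size();

    // Skip an optional opening quote.
    if (pos < lower.size() && lower[pos] == '"')
        ++pos;

    // The value ends at a quote, a semicolon, or whitespace.
    size_t end = pos;
    while (end < lower.size() && lower[end] != '"' && lower[end] != ';' &&
           !std::isspace(static_cast<unsigned char>(lower[end])))
        ++end;
    return lower.substr(pos, end - pos);
}

// Hypothetical helper: true only when the header declares UTF-8, which is
// the case where a UTF-8 to GBK conversion is appropriate.
static bool NeedsUtf8ToGbk(const std::string &contentType) {
    std::string charset = ExtractCharset(contentType);
    return charset == "utf-8" || charset == "utf8";
}
```

With this, the conversion in main could be guarded by NeedsUtf8ToGbk and skipped when the server already reports a GBK-family charset.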


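Also, because the code uses ATL, compiling it as a .c file fails with this error:
error C1189: #error :  ATL requires C++ compilation (use a .cpp suffix) (c:\program files (x86)\microsoft visual studio 10.0\vc\atlmfc\include\atlstr.h, line 16)

Renaming the file from .c to .cpp fixes it. The listing also uses std::string, so it has to be compiled as C++ in any case.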

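The curl build I used:

I downloaded curl from the official site and built it (the Visual Studio solutions are under curl\curl-7.52.1\projects\Windows\). After the build, curl\curl-7.52.1\builds\libcurl-vc10-x86-release-dll-ipv6-sspi-winssl\ contains three folders: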

bin
--curl.exe
--libcurl.dll (must be copied to the directory the program runs from)

include
--curl (header directory)

lib
--libcurl.lib

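Little else needed configuring: the project only has to add the curl include directory and the library directory, and link against libcurl.lib.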


