Installing and Setting Up Scrapy, a Powerful Web Scraping Tool
2015-06-27 12:57
1. Install Python 2.7.x
# wget http://www.python.org/ftp/python/2.7.3/Python-2.7.3.tgz
# tar xvf Python-2.7.3.tgz
# cd Python-2.7.3
# ./configure
# make && make install
Verify Python
[root@~]# python
Python 2.7.3 (default, Feb 28 2013, 03:08:43)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
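The same check can also be done from a script rather than by reading the interpreter banner. The following is a minimal sketch; the helper name `is_python27` is hypothetical and not part of the original walkthrough.

```python
from __future__ import print_function

import sys


def is_python27(version_info=None):
    """Return True if the given (or the running) interpreter is Python 2.7.x.

    `is_python27` is an illustrative helper name, not a standard function.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only the major and minor version numbers.
    return tuple(version_info[:2]) == (2, 7)


if __name__ == "__main__":
    print("Running Python %d.%d.%d" % tuple(sys.version_info[:3]))
    print("Python 2.7.x:", is_python27())
```

Run it with the interpreter that was just built (for example `python2.7`) to confirm that the expected version is the one on the PATH.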
2. Install setuptools
http://pypi.python.org/packages/source/s/setuptools/setuptools-0.6c11.tar.gz
# wget http://pypi.python.org/packages/source/s/setuptools/setuptools-0.6c11.tar.gz
# tar zxvf setuptools-0.6c11.tar.gz
# cd setuptools-0.6c11
# python2.7 setup.py install
3. Install Twisted
# cd setuptools-0.6c11
# easy_install Twisted
4. Install w3lib
# cd setuptools-0.6c11
# easy_install w3lib
5. Install libxml2, or install lxml with easy_install
# cd setuptools-0.6c11
# easy_install lxml
Verify the lxml installation
[root@~]# python
Python 2.7.3 (default, Feb 28 2013, 03:08:43)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import lxml
>>> exit()
6. Install pyOpenSSL (optional; mainly so that Scrapy can support HTTPS)
# wget http://launchpadlibrarian.net/58498441/pyOpenSSL-0.11.tar.gz
# tar zxvf pyOpenSSL-0.11.tar.gz
# cd pyOpenSSL-0.11
# python setup.py install
7. Install pip
# cd setuptools-0.6c11
# easy_install pip
8. Install Scrapy with pip
# pip install scrapy
Verify the installation
[root@~]# scrapy
Scrapy 0.16.4 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  fetch         Fetch a URL using the Scrapy downloader
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

  [ more ]      More commands available when run from project directory

Use "scrapy <command> -h" to see more info about a command
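If the scrapy command fails to start, it can help to check which of the packages installed in the steps above can actually be imported. The script below is a minimal sketch of such a check. The helper name `check_modules` and the REQUIRED/OPTIONAL grouping are assumptions made for illustration; the import names are those of the packages themselves (pyOpenSSL is imported as `OpenSSL`).

```python
from __future__ import print_function

import importlib

# Import names of the packages installed in steps 3 to 8.
REQUIRED = ["twisted", "w3lib", "lxml", "scrapy"]
# pyOpenSSL is imported as "OpenSSL"; the walkthrough treats it as optional.
OPTIONAL = ["OpenSSL"]


def check_modules(names):
    """Try to import each module and return a dict of name -> True/False.

    `check_modules` is a hypothetical helper for this walkthrough.
    """
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results


if __name__ == "__main__":
    for name, ok in sorted(check_modules(REQUIRED + OPTIONAL).items()):
        print("%-10s %s" % (name, "OK" if ok else "MISSING"))
```

Any module reported as MISSING points to the installation step that needs to be repeated.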