
Installing and Setting Up Scrapy, a Powerful Web Scraping Tool

2015-06-27 12:57

1. Install Python 2.7.x

#wget http://www.python.org/ftp/python/2.7.3/Python-2.7.3.tgz
#tar xvf Python-2.7.3.tgz
#cd Python-2.7.3
#./configure
#make && make install


Verify Python

[root@~]# python
Python 2.7.3 (default, Feb 28 2013, 03:08:43)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
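
The interactive check above can also be scripted. The following is a minimal sketch, not part of the original steps; the helper name `is_python27` is hypothetical and simply compares the interpreter's version tuple against 2.7.

```python
import sys


def is_python27(version_info=None):
    """Return True if the (major, minor, ...) tuple describes Python 2.7.x.

    `is_python27` is a hypothetical helper name used for illustration only.
    """
    if version_info is None:
        # Default to the interpreter that is running this script.
        version_info = sys.version_info
    return tuple(version_info[:2]) == (2, 7)


if __name__ == "__main__":
    print("Running Python %d.%d.%d" % tuple(sys.version_info[:3]))
    print("Is 2.7.x: %s" % is_python27())
```

Run it with the interpreter you just built (for example `python2.7`) to confirm that the expected version is the one on your PATH.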


2. Install setuptools

http://pypi.python.org/packages/source/s/setuptools/setuptools-0.6c11.tar.gz

#wget http://pypi.python.org/packages/source/s/setuptools/setuptools-0.6c11.tar.gz
#tar zxvf setuptools-0.6c11.tar.gz
#cd setuptools-0.6c11
#python2.7 setup.py install


3. Install Twisted

#easy_install Twisted


4. Install w3lib

#easy_install w3lib


5. Install libxml2, or install lxml with easy_install

#easy_install lxml


Verify the lxml installation

[root@~]# python
Python 2.7.3 (default, Feb 28 2013, 03:08:43)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import lxml
>>> exit()
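
The same import test can be applied to every package installed in steps 3 to 5 at once. This is a rough sketch under the assumption that the packages import as `twisted`, `w3lib` and `lxml`; the helper name `check_modules` is hypothetical, and not every package exposes a `__version__` attribute, so the version is reported as "unknown" when it is absent.

```python
import importlib


def check_modules(names):
    """Try to import each module by name.

    Returns a dict mapping name -> version string ("unknown" if the module
    has no __version__ attribute) or None if the import fails.
    `check_modules` is a hypothetical helper, not part of Scrapy.
    """
    results = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            results[name] = None
            continue
        results[name] = str(getattr(module, "__version__", "unknown"))
    return results


if __name__ == "__main__":
    report = check_modules(["twisted", "w3lib", "lxml"])
    for name, version in sorted(report.items()):
        status = "MISSING" if version is None else "ok (version: %s)" % version
        print("%-8s %s" % (name, status))
```

Any package reported as MISSING should be reinstalled before continuing with the remaining steps.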


6. Install pyOpenSSL (optional; mainly needed so that Scrapy can support HTTPS)

#wget http://launchpadlibrarian.net/58498441/pyOpenSSL-0.11.tar.gz
#tar zxvf pyOpenSSL-0.11.tar.gz
#cd pyOpenSSL-0.11
#python setup.py install


7. Install pip

#easy_install pip


8. Install Scrapy with pip

#pip install scrapy


Verify the installation

[root@~]# scrapy
Scrapy 0.16.4 - no active project

Usage:
scrapy <command> [options] [args]

Available commands:
fetch         Fetch a URL using the Scrapy downloader
runspider     Run a self-contained spider (without creating a project)
settings      Get settings values
shell         Interactive scraping console
startproject  Create new project
version       Print Scrapy version
view          Open URL in browser, as seen by Scrapy

[ more ]      More commands available when run from project directory

Use "scrapy <command> -h" to see more info about a command
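
If you want to check the result from a script rather than by eye, the banner line shown above can be parsed. The sketch below assumes the first line of output has the form `Scrapy <version> - ...`, as in the transcript; the function names `parse_scrapy_banner` and `installed_scrapy_version` are hypothetical helpers, not part of Scrapy.

```python
import re
import subprocess


def parse_scrapy_banner(output):
    """Extract the version from a banner such as
    'Scrapy 0.16.4 - no active project'. Returns None if it does not match."""
    lines = output.strip().splitlines()
    if not lines:
        return None
    match = re.match(r"Scrapy\s+(\d+(?:\.\d+)*)", lines[0].strip())
    return match.group(1) if match else None


def installed_scrapy_version():
    """Run the `scrapy` command and parse its banner.

    Returns None if the command cannot be started (for example, when it is
    not on the PATH) or if the banner is not in the assumed format.
    """
    try:
        proc = subprocess.Popen(
            ["scrapy"], stdout=subprocess.PIPE, stderr=subprocess.STDOUT
        )
    except OSError:
        return None
    output = proc.communicate()[0].decode("utf-8", "replace")
    return parse_scrapy_banner(output)


if __name__ == "__main__":
    # Sample text copied from the transcript above.
    sample = "Scrapy 0.16.4 - no active project\n\nUsage:\nscrapy <command> [options] [args]\n"
    print(parse_scrapy_banner(sample))
```

Calling `installed_scrapy_version()` on the machine where Scrapy was installed should return the same version string that the banner prints.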