
Python Crawler Troubleshooting Notes (continuously updated)

2018-02-04 05:33

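@Distributed crawler: the slave node cannot find scrapy_redis:

Run the slave with sudo: sudo scrapy crawl spidername, or sudo scrapy runspider mycrawler_redis.py. Either way, use sudo.

Without sudo the run fails with a module-not-found error. I have no explanation for this yet, which is frustrating.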
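One possible explanation, offered as a guess rather than a diagnosis: sudo may run a different Python interpreter (or use a different site-packages directory) than the plain shell, and scrapy_redis was installed only for that one. The sketch below can be run once with sudo and once without to compare which interpreter is in use and whether it can see the module. The helper name locate_module is hypothetical.

```python
import importlib.util
import sys


def locate_module(name):
    """Return the path the current interpreter would import `name` from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # Compare these two lines between "python check.py" and "sudo python check.py".
    print("interpreter:", sys.executable)
    print("scrapy_redis:", locate_module("scrapy_redis"))
```

If the two runs print different interpreters, installing scrapy_redis for the interpreter that the non-sudo scrapy command uses may remove the need for sudo.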

@Distributed crawler: connection to the remote Redis server is refused:

Error: redis.exceptions.ResponseError: DENIED Redis is running in protected mode…

Fix: https://www.cnblogs.com/nzbbody/p/6389619.html
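For context, Redis refuses non-loopback clients in protected mode when the server has neither a bind address nor a password configured. One common way out is to set a password on the server (the requirepass directive in redis.conf) and pass it from the slave. The fragment below is a minimal sketch of the slave's settings.py under that assumption; the host, port, and password are placeholders, not values from the original post.

```python
# settings.py on the slave node (sketch; all values below are placeholders)

# Use the scrapy-redis scheduler and duplicate filter so requests are shared via Redis.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"

# Assumed: the master's redis.conf sets "requirepass change-me" and listens
# on 192.0.2.10:6379. The password goes after the colon, before the "@".
REDIS_URL = "redis://:change-me@192.0.2.10:6379/0"
```

Setting a password is generally safer than switching protected mode off on a server that is reachable from other machines.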

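@Crawler reports a lost connection:

Error: twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion.

The site's anti-crawler measures have most likely blocked the request; configure request headers or an IP proxy.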
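Setting headers and a proxy can be sketched as a Scrapy downloader middleware. This is one possible approach, not the only one; the class name, the User-Agent strings, and the proxy address are all illustrative assumptions.

```python
import random

# Placeholder values; replace them with real browser strings and working proxies.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0",
]
PROXIES = ["http://192.0.2.20:8080"]


class RandomHeaderProxyMiddleware:
    """Hypothetical middleware: picks a User-Agent and a proxy for each request."""

    def process_request(self, request, spider):
        request.headers["User-Agent"] = random.choice(USER_AGENTS)
        if PROXIES:
            # Scrapy's built-in HttpProxyMiddleware reads this meta key.
            request.meta["proxy"] = random.choice(PROXIES)
        return None  # let Scrapy continue handling the request
```

To enable it, the class would be registered in settings.py, for example DOWNLOADER_MIDDLEWARES = {"myproject.middlewares.RandomHeaderProxyMiddleware": 543}, where myproject.middlewares is an assumed module path.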

@Installing the Chrome browser on Ubuntu 16:

http://www.linuxidc.com/Linux/2016-05/131096.htm

@Installing chromedriver and PhantomJS:

https://www.cnblogs.com/Lin-Yi/p/7658001.html

Now that Chrome supports headless mode, PhantomJS is obsolete and not really worth learning.
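Headless Chrome can be driven from Selenium roughly as follows. This is a minimal sketch that assumes the selenium package is installed, chromedriver is on the PATH, and the Selenium version accepts the options= keyword (older releases used chrome_options= instead). The function name and the argument list are illustrative.

```python
# Command-line switches passed to Chrome; "--headless" is the one that matters here.
HEADLESS_ARGS = ["--headless", "--disable-gpu", "--window-size=1366,768"]


def make_headless_chrome():
    """Start a headless Chrome session (hypothetical helper)."""
    # Imported lazily so this module loads even where selenium is not installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in HEADLESS_ARGS:
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

Typical use would be driver = make_headless_chrome(), then driver.get(url) and driver.page_source, followed by driver.quit() to release the browser process.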

@The chromedriver version must match the Chrome version; otherwise an invalid-context error is raised (Runtime.executionContextCreated has invalid 'context'):

http://blog.csdn.net/c08762/article/details/70339587
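A quick way to spot a mismatch is to compare the version strings printed by the browser and the driver. The sketch below assumes ChromeDriver 73 or later, where the driver's major version is meant to equal Chrome's; the older 2.x drivers were matched through a compatibility table instead, so this check does not apply to them. The function names and the sample strings are illustrative.

```python
import re


def major_version(version_output):
    """Extract the leading major number from text such as 'Google Chrome 114.0.5735.90'."""
    match = re.search(r"(\d+)\.\d+", version_output)
    if match is None:
        raise ValueError("no version number found in: " + repr(version_output))
    return int(match.group(1))


def versions_match(chrome_output, driver_output):
    """True when Chrome and chromedriver report the same major version."""
    return major_version(chrome_output) == major_version(driver_output)
```

The two inputs would normally come from running the browser and chromedriver with --version; the exact executable names depend on the installation.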
