Scrapy pipeline spider_opened and spider_closed not being called
2014-05-15 18:50
585 views
http://stackoverflow.com/questions/4113275/scrapy-pipeline-spider-opened-and-spider-closed-not-being-called
http://codego.net/215885/
I am having some trouble with a scrapy pipeline. My information is being scraped from sites OK and the process_item method is being called correctly. However, the spider_opened and spider_closed methods are not being called.

    class MyPipeline(object):
        def __init__(self):
            log.msg("Initializing Pipeline")
            self.conn = None
            self.cur = None

        def spider_opened(self, spider):
            log.msg("Pipeline.spider_opened called", level=log.DEBUG)

        def spider_closed(self, spider):
            log.msg("Pipeline.spider_closed called", level=log.DEBUG)

        def process_item(self, item, spider):
            log.msg("Processing item " + item['title'], level=log.DEBUG)

Both the __init__ and process_item logging messages are displayed in the log, but the spider_opened and spider_closed messages are not. I need the spider_opened and spider_closed methods because I want to use them to open and close a connection to a database, but nothing is showing up in the log for them. If anyone has any suggestions, that would be very useful.

Tags: python, pipeline, scrapy
2 Answers
Accepted answer: Sorry, found it just after I posted this. You have to add:

    dispatcher.connect(self.spider_opened, signals.spider_opened)
    dispatcher.connect(self.spider_closed, signals.spider_closed)

in __init__, otherwise the pipeline never receives the signals that call those methods.
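The accepted answer hinges on one fact: merely defining a method named spider_opened does nothing; a handler only fires if it is explicitly connected to the signal (in the Scrapy of that era, via the pydispatch dispatcher). The snippet below is a toy stand-in I wrote to illustrate that mechanism, not Scrapy's real signal API: a pipeline that connects its handler in __init__ fires, an otherwise identical one that skips the connect stays silent.

```python
# Toy signal dispatcher: maps a signal name to its connected receivers.
# This is an illustration of the pattern, not the scrapy API.

class Dispatcher:
    def __init__(self):
        self._receivers = {}

    def connect(self, receiver, signal):
        self._receivers.setdefault(signal, []).append(receiver)

    def send(self, signal, **kwargs):
        for receiver in self._receivers.get(signal, []):
            receiver(**kwargs)

dispatcher = Dispatcher()
calls = []

class BrokenPipeline:
    # Defines spider_opened but never connects it -- mirrors the question.
    def spider_opened(self, spider):
        calls.append("broken.spider_opened")

class FixedPipeline:
    # Connects the handler in __init__ -- mirrors the accepted answer.
    def __init__(self):
        dispatcher.connect(self.spider_opened, "spider_opened")

    def spider_opened(self, spider):
        calls.append("fixed.spider_opened")

BrokenPipeline()
FixedPipeline()
dispatcher.send("spider_opened", spider=None)
print(calls)  # only the connected handler fired: ['fixed.spider_opened']
```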
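For the asker's actual goal of opening and closing a database connection, it is worth noting that a pipeline class can also define open_spider and close_spider methods, which Scrapy calls automatically with no signal wiring at all (this is the documented pipeline interface in later Scrapy releases). The sketch below uses an in-memory sqlite3 database and is plain Python, so the hooks can be exercised directly the way Scrapy would invoke them during a crawl:

```python
import sqlite3

class DbPipeline:
    def open_spider(self, spider):
        # ":memory:" keeps the sketch self-contained; a real pipeline
        # would take a file path or DSN from the project settings.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE items (title TEXT)")

    def close_spider(self, spider):
        self.conn.commit()
        self.conn.close()

    def process_item(self, item, spider):
        self.conn.execute("INSERT INTO items (title) VALUES (?)",
                          (item["title"],))
        return item

# Exercise the hooks directly, in the order Scrapy would call them:
pipeline = DbPipeline()
pipeline.open_spider(spider=None)
pipeline.process_item({"title": "example"}, spider=None)
stored = pipeline.conn.execute("SELECT title FROM items").fetchall()
pipeline.close_spider(spider=None)
print(stored)  # [('example',)]
```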