Scrapy pipeline spider_opened and spider_closed not being called

2014-05-15 18:50, 585 views
http://stackoverflow.com/questions/4113275/scrapy-pipeline-spider-opened-and-spider-closed-not-being-called



I am having some trouble with a Scrapy pipeline. My information is being scraped from sites fine and the process_item method is being called correctly. However, the spider_opened and spider_closed methods are not being called.
from scrapy import log


class MyPipeline(object):

    def __init__(self):
        log.msg("Initializing Pipeline")
        self.conn = None
        self.cur = None

    def spider_opened(self, spider):
        log.msg("Pipeline.spider_opened called", level=log.DEBUG)

    def spider_closed(self, spider):
        log.msg("Pipeline.spider_closed called", level=log.DEBUG)

    def process_item(self, item, spider):
        log.msg("Processing item " + item['title'], level=log.DEBUG)
        return item

Both the __init__ and process_item logging messages are displayed in the log, but the spider_opened and spider_closed logging messages are not.

I need to use the spider_opened and spider_closed methods as I want to use them to open and close a connection to a database, but nothing is showing up in the log for them.

If anyone has any suggestions, that would be very useful.

python pipeline scrapy
asked Nov 6 '10 at 13:29 by Jim Jeffries


2 Answers


Accepted answer (6 votes)
Sorry, found it just after I posted this. You have to add:
dispatcher.connect(self.spider_opened, signals.spider_opened)
dispatcher.connect(self.spider_closed, signals.spider_closed)


in __init__, otherwise the pipeline never receives the signals that trigger those methods.

answered Nov 6 '10 at 13:39 by Jim Jeffries

Thanks for your answer, but where do you get the dispatcher variable? And how come I can't find this in doc.scrapy.org/en/latest/topics/item-pipeline.html? :( – wrongusername Oct 8 '12 at 18:05
For this to work, you need to make sure that you import the following things:
from scrapy.xlib.pydispatch import dispatcher
from scrapy import signals
– herrherr Oct 28 '13 at 15:08
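The pattern behind the accepted answer can be illustrated without Scrapy at all: a handler only fires if it has been connected to the signal, which is exactly what the dispatcher.connect(...) calls in __init__ do. The sketch below uses a toy Signal class as a stand-in for Scrapy's signal machinery; the Signal class, the spider name, and the events list are all invented for illustration, not real Scrapy API.

```python
class Signal:
    """Toy stand-in for a Scrapy signal: just a list of handlers."""

    def __init__(self):
        self._handlers = []

    def connect(self, handler):
        self._handlers.append(handler)

    def send(self, **kwargs):
        for handler in self._handlers:
            handler(**kwargs)


# Stand-ins for scrapy.signals.spider_opened / spider_closed.
spider_opened = Signal()
spider_closed = Signal()


class MyPipeline:
    def __init__(self):
        self.events = []
        # The fix from the accepted answer: without these two connect()
        # calls, the engine holds no reference to the handlers below and
        # never invokes them, which is why nothing showed up in the log.
        spider_opened.connect(self.spider_opened)
        spider_closed.connect(self.spider_closed)

    def spider_opened(self, spider):
        self.events.append("opened:" + spider)  # e.g. open the DB connection

    def spider_closed(self, spider):
        self.events.append("closed:" + spider)  # e.g. close the DB connection


pipeline = MyPipeline()
spider_opened.send(spider="myspider")  # the engine fires this at startup
spider_closed.send(spider="myspider")  # ...and this at shutdown
print(pipeline.events)
```

In real code against the old Scrapy 0.x API used in this thread, the wiring is the two dispatcher.connect lines from the answer plus the two imports from the comments. Note that later Scrapy releases removed scrapy.xlib.pydispatch and scrapy.log; modern pipelines instead define open_spider(self, spider) and close_spider(self, spider) methods, which Scrapy calls automatically with no signal wiring needed.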