Passing arguments from parse through multiple callbacks in Scrapy

[scrapy] 2024-04-29

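Summary: how to pass arguments from parse to later callbacks in Scrapy. By returning a Request from each callback, you can hand data along a chain of callbacks and crawl pages in a loop.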

To pass arguments from parse to a callback in Scrapy, start with an example:

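import scrapy

# Both functions are methods of a scrapy.Spider subclass.
# MyItem is a scrapy.Item subclass with 'urla' and 'urlb' fields.
def parse(self, response):
    item = MyItem()
    item['urla'] = response.url
    # Attach the partially filled item to the next request.
    request = scrapy.Request("http://www.xoxxoo.com/article/show/i/277.html",
                             callback=self.parse_a)
    request.meta['item'] = item
    return request


def parse_a(self, response):
    # Retrieve the item that parse() stored on the request.
    item = response.meta['item']
    item['urlb'] = response.url
    return item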


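As shown above, each callback can return a new Request that carries data in its meta dict, so you can repeat this pattern to pass arguments along several callbacks and crawl pages in a loop.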


Background:

Scrapy Request object parameters:

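1. url (string)
2. callback (callable)
3. method (string), defaults to GET
4. meta (dict), initial values for the Request.meta attribute
5. headers (dict)
6. cookies (dict or list)
7. encoding (string), defaults to utf-8
8. priority (int), the priority used by the scheduler; defaults to 0
9. dont_filter (boolean), if True the scheduler does not filter this request as a duplicate
10. errback (callable), called if an exception is raised while processing the request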


Scrapy Response object parameters:

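1. url (string)
2. headers (dict)
3. status (integer), e.g. 200, 404
4. body (bytes)
5. meta (dict), a shortcut to the meta of the Request that produced this response (response.request.meta)
6. flags (list), e.g. 'cached', 'redirected'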



