
How do I manually stop a Scrapy spider?

[Unresolved]

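Before I started using proxy IPs, I always stopped the spider with Ctrl+C. Now that I use a proxy, Ctrl+C no longer stops it. Below is the output after pressing Ctrl+C: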
2021-05-10 09:46:03 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.b2b168.com/c168-19142095.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2021-05-10 09:46:03 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.b2b168.com/c168-19146912.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2021-05-10 09:46:03 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.b2b168.com/c168-19185028.html> (failed 1 times): An error occurred while connecting: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionLost'>: Connection to the other side was lost in a non-clean fashion: Connection lost.
2021-05-10 09:46:03 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.b2b168.com/c168-19146450.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2021-05-10 09:46:54 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): dps.kdlapi.com:443
2021-05-10 09:46:54 [urllib3.connectionpool] DEBUG: https://dps.kdlapi.com:443 "GET /api/getdps/?orderid=962036615546243&num=1&pt=1&format=json&sep=1 HTTP/1.1" 200 None
2021-05-10 09:47:54 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): dps.kdlapi.com:443
2021-05-10 09:47:54 [urllib3.connectionpool] DEBUG: https://dps.kdlapi.com:443 "GET /api/getdps/?orderid=962036615546243&num=1&pt=1&format=json&sep=1 HTTP/1.1" 200 None
2021-05-10 09:48:54 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): dps.kdlapi.com:443
2021-05-10 09:48:55 [urllib3.connectionpool] DEBUG: https://dps.kdlapi.com:443 "GET /api/getdps/?orderid=962036615546243&num=1&pt=1&format=json&sep=1 HTTP/1.1" 200 None
2021-05-10 09:49:55 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): dps.kdlapi.com:443
2021-05-10 09:49:55 [urllib3.connectionpool] DEBUG: https://dps.kdlapi.com:443 "GET /api/getdps/?orderid=962036615546243&num=1&pt=1&format=json&sep=1 HTTP/1.1" 200 None

Acheng1011 | Beginner, level 1 | Points: 26
Asked: 2021-05-10 10:19
All answers (2)
Run ps -aux | grep your_spider_name to find the process, then stop it with kill -9.
小小咸鱼YwY | Points: 2918 (Veteran, level 4) | 2021-05-10 10:36
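
The ps, grep, and kill steps in this answer can also be scripted. The sketch below is one possible way to do that in Python on Linux or macOS. It parses the output of ps -eo pid,args, sends SIGINT first (the same signal Ctrl+C sends, which Scrapy treats as a request for a graceful shutdown), and only falls back to SIGKILL, the equivalent of kill -9, if the process is still running after a grace period. The function names, the 10-second grace period, and the my_spider keyword are illustrative assumptions, not something from this thread.

```python
import os
import signal
import subprocess
import time


def find_pids(ps_output, keyword, skip_pid=None):
    """Return the PIDs in `ps -eo pid,args` output whose command line contains keyword."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skip the header row and anything that does not look like "<pid> <args>".
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, args = int(parts[0]), parts[1]
        if keyword in args and pid != skip_pid:
            pids.append(pid)
    return pids


def is_alive(pid):
    """Check whether a process exists without disturbing it."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stop_process(pid, grace_seconds=10.0):
    """Send SIGINT, wait up to grace_seconds, then fall back to SIGKILL."""
    os.kill(pid, signal.SIGINT)
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        if not is_alive(pid):
            return "stopped"
        time.sleep(0.2)
    os.kill(pid, signal.SIGKILL)  # same effect as kill -9
    return "killed"


def stop_spider(keyword):
    """Stop every process whose command line contains keyword (hypothetical helper)."""
    output = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    ).stdout
    # Exclude this script itself in case the keyword appears in its own arguments.
    return {pid: stop_process(pid) for pid in find_pids(output, keyword, os.getpid())}
```

Calling stop_spider("scrapy crawl my_spider") from a second terminal would then try the graceful route before the forced one. Trying SIGINT first matters because kill -9 gives Scrapy no chance to close item pipelines or flush output files.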

So I'd have to open another window for that.

Acheng1011 | Points: 26 (Beginner, level 1) | 2021-05-10 10:37

@abcd12,,: Take a look at scrapydweb.

小小咸鱼YwY | Points: 2918 (Veteran, level 4) | 2021-05-10 10:40

@小小咸鱼YwY: OK, thanks.

Acheng1011 | Points: 26 (Beginner, level 1) | 2021-05-10 10:42

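It depends on how you start the spider. If you start it with a command in a terminal, you have to kill the corresponding process. If you start it from a run configuration in your IDE, you can just stop it there. If you start it from a script, you also have to kill the corresponding background process. Scrapy also supports signals, so you can register handlers that fire when the spider starts and when it closes. Writing your own shutdown script would work too.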

写爬虫的后端是好前端 | Points: 204 (Novice, level 2) | 2021-05-10 17:32
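
On the point about hooking into shutdown: the log in the question shows requests to dps.kdlapi.com continuing once a minute after Ctrl+C, and they come from urllib3 rather than from Scrapy's downloader. That pattern is consistent with a proxy-refresh loop running on its own thread and keeping the process alive, but the asker's code is not shown, so this is only a guess. Under that assumption, a minimal sketch of a refresher that can be stopped cleanly looks like this. The ProxyRefresher class and the fetch callable are hypothetical names introduced for illustration.

```python
import threading


class ProxyRefresher:
    """Call fetch() every `interval` seconds on a background thread until stopped."""

    def __init__(self, fetch, interval=60.0):
        self._fetch = fetch  # hypothetical callable that returns a proxy address
        self._interval = interval
        self._stop = threading.Event()
        self.current = None
        # daemon=True means this thread cannot keep the process alive on its own.
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        while not self._stop.is_set():
            try:
                self.current = self._fetch()
            except Exception:
                pass  # keep the previous proxy if a refresh fails
            # Unlike time.sleep(), wait() returns as soon as stop() sets the event.
            self._stop.wait(self._interval)

    def stop(self, timeout=5.0):
        """Ask the loop to exit; return True if the thread finished in time."""
        self._stop.set()
        self._thread.join(timeout)
        return not self._thread.is_alive()
```

In a Scrapy project, stop() could be called from the spider's closed(self, reason) method, which Scrapy invokes when the spider closes, or from a handler connected to the spider_closed signal if the refresher lives in a middleware or extension.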

OK, thank you.

Acheng1011 | Points: 26 (Beginner, level 1) | 2021-05-11 08:47

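@abcd12,,:
You could try this approach:
from scrapy import cmdline
if __name__ == '__main__':
    # Replace spider_name with the name of your spider.
    cmdline.execute('scrapy crawl spider_name'.split())
This starts the spider directly through the Python interpreter, so stopping the interpreter stops the spider.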

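@写爬虫的后端是好前端: I know about that one, but I always start the spider by typing the command in the console. Ever since I added the proxy IP and Ctrl+C stopped working, I have been closing PyCharm and reopening it.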

Acheng1011 | Points: 26 (Beginner, level 1) | 2021-05-11 08:58

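@abcd12,,: Since you are starting it for debugging, the approach above lets you start and stop the spider whenever you like, which is more convenient. Using the command line is too inefficient.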

@写爬虫的后端是好前端: OK, I'll give it a try. Thanks.

Acheng1011 | Points: 26 (Beginner, level 1) | 2021-05-11 09:02