Before I used proxy IPs I always stopped the spider with Ctrl+C. After adding the proxies, Ctrl+C no longer stops it. Here is the output after pressing Ctrl+C:
2021-05-10 09:46:03 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.b2b168.com/c168-19142095.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2021-05-10 09:46:03 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.b2b168.com/c168-19146912.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2021-05-10 09:46:03 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.b2b168.com/c168-19185028.html> (failed 1 times): An error occurred while connecting: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionLost'>: Connection to the other side was lost in a non-clean fashion: Connection lost.
2021-05-10 09:46:03 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.b2b168.com/c168-19146450.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2021-05-10 09:46:54 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): dps.kdlapi.com:443
2021-05-10 09:46:54 [urllib3.connectionpool] DEBUG: https://dps.kdlapi.com:443 "GET /api/getdps/?orderid=962036615546243&num=1&pt=1&format=json&sep=1 HTTP/1.1" 200 None
2021-05-10 09:47:54 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): dps.kdlapi.com:443
2021-05-10 09:47:54 [urllib3.connectionpool] DEBUG: https://dps.kdlapi.com:443 "GET /api/getdps/?orderid=962036615546243&num=1&pt=1&format=json&sep=1 HTTP/1.1" 200 None
2021-05-10 09:48:54 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): dps.kdlapi.com:443
2021-05-10 09:48:55 [urllib3.connectionpool] DEBUG: https://dps.kdlapi.com:443 "GET /api/getdps/?orderid=962036615546243&num=1&pt=1&format=json&sep=1 HTTP/1.1" 200 None
2021-05-10 09:49:55 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): dps.kdlapi.com:443
2021-05-10 09:49:55 [urllib3.connectionpool] DEBUG: https://dps.kdlapi.com:443 "GET /api/getdps/?orderid=962036615546243&num=1&pt=1&format=json&sep=1 HTTP/1.1" 200 None
Run ps aux | grep <your spider name>, then kill -9 the matching process.
You'll need to open another terminal window for that.
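The kill-from-another-window pattern above can be sketched end to end with a throwaway `sleep` process standing in for the spider; the process name `scrapy crawl myspider` is only an illustration, not taken from the thread:

```shell
# Start a long-running background process; it stands in for a spider
# launched with e.g. "scrapy crawl myspider".
sleep 300 &
PID=$!

# From a second window you would locate it like this:
ps aux | grep -v grep | grep "sleep 300"

# Force-kill it, since Ctrl+C is not reaching it.
kill -9 "$PID"

# Confirm it is gone (wait reports a non-zero status for a killed child).
wait "$PID" 2>/dev/null || echo "killed"
```

In practice, replace `"sleep 300"` in the grep with your spider's name and use the PID column that `ps` prints.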
@abcd12,,: Take a look at scrapydweb.
@小小咸鱼YwY: OK, thanks.
It depends on how you start the spider. If you launch it with a command in the terminal, you need to kill the corresponding process; if you launch it from an IDE run configuration, just close the IDE. If you launch it from a script, you likewise have to kill the process in the background. Also, Scrapy supports passing information via signals, so you can hook the startup and shutdown events with signals and write your own shutdown script.
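The signal idea in the comment above can be illustrated with a plain-Python sketch using only the stdlib `signal` module, no Scrapy: install a SIGINT handler that records a shutdown request instead of raising `KeyboardInterrupt`, which is the underlying OS-level mechanism a custom shutdown hook builds on. (In a real Scrapy project you would connect handlers through `crawler.signals.connect` rather than touching OS signals directly.)

```python
import os
import signal

shutdown_requested = False

def handle_sigint(signum, frame):
    # Record the request instead of raising KeyboardInterrupt, so a
    # crawl loop could finish its current work and exit cleanly.
    global shutdown_requested
    shutdown_requested = True

signal.signal(signal.SIGINT, handle_sigint)

# Simulate pressing Ctrl+C by delivering SIGINT to this process.
os.kill(os.getpid(), signal.SIGINT)

print("shutdown requested:", shutdown_requested)  # shutdown requested: True
```

A long-running loop would then check `shutdown_requested` on each iteration and break out of the crawl when it flips to `True`.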
OK, thanks.
@abcd12,,:
You can try this approach:

from scrapy import cmdline

if __name__ == '__main__':
    cmdline.execute('scrapy crawl <spider name>'.split())

This launches the crawl directly through the Python interpreter, so closing the interpreter closes the spider.
@写爬虫的后端是好前端: I know about that one, but I always start the spider from the console. Ever since Ctrl+C stopped working after I added the proxy IPs, I've been closing PyCharm and reopening it.
@abcd12,,: Since you're starting it for debugging anyway, the approach above lets you start and stop whenever you want, which is more convenient; running it from the command line is too inefficient.
@写爬虫的后端是好前端: All right, I'll give it a try, thanks.