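I'm new to web scraping and am crawling the hot comments list on Jianshu. Everything imports into MySQL except the like count, which fails with the following error: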
pymysql.err.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'like,reward) values ('462','7')' at line 1")
import requests
from bs4 import BeautifulSoup
from lxml import etree
from multiprocessing import Pool
import pymysql

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 ' \
                         '(KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36'
           }
conn = pymysql.connect(host='localhost',user='root',passwd='123456',db='mydb',port=3306,charset='utf8')
cursor = conn.cursor()

def get_jianshu_info(url):
    res = requests.get(url,headers=headers)
    selector = etree.HTML(res.text)
    infos = selector.xpath('//ul[@class="note-list"]/li')
    for info in infos:
        try:
            title = info.xpath('div/a/text()')[0]
            author =info.xpath('div / div / a[1]/text()')[0]
            content = info.xpath('div/p/text()')[0].strip()
            comment = info.xpath('div/div/a/text()')[2].strip()
            if len(comment)==0:
                comment = '无'
            like = info.xpath('div/div/span[1]/text()')[0].strip()
            if len(like) == 0 and 11:
                like = '无'
            reward = info.xpath('div/div/span[2]/text()')
            if len(reward) == 0:
                reward = '无'
            else:
                reward = reward[0].strip()
            cursor.execute(
                "insert into jianshureping (like,reward) "
                "values (%s,%s)", (str(like),str(reward))
            )
            conn.commit()
            print('ok')
        except IndexError:
            print('error')

if __name__ == '__main__':
    urls = \
        ['https://www.jianshu.com/c/bDHhpK?order_by=commented_at&page={}'.format(str(i)) for i in range(1, 3)]
    for url in urls:
        get_jianshu_info(url)
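If I drop the like count, the insert works. Scraping the like count on its own showed that some values contain a newline, which I have already handled with a conditional (empty values become '无', meaning "none"). The results look like this:
Likes: ['462', '21349', '无', '2885', '118', '60', '17', '436', '18', '4']
Comments: ['112', '12572', '20', '179', '17', '46', '23', '237', '10', '6']
Rewards: ['7', '121', '58', '8', '无', '2', '无', '2', '无', '1']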
cursor.execute(
    "insert into jianshureping (like,reward) "
    "values (%s,%s)", (str(like),str(reward))
)
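If you mean to insert all three values (likes, comments, rewards), then the statement itself is wrong, because it only lists two columns.
If you have already changed the statement and you are sure the data is fine, then the problem is in the database table.
You can print the SQL statement from your code, run it in a GUI database client, and track down the root cause from there. Pay attention to single versus double quotes in the SQL.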
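On the quoting point: the error message stops right at `like,reward)`, and LIKE is a reserved word in MySQL, so an unquoted column named like is a probable cause of the 1064 error. This is a guess from the error text, not something confirmed in the thread. A minimal sketch of building the statement with backtick-quoted identifiers follows; the helpers quote_identifier and build_insert are hypothetical names made up for this example, not part of pymysql.

```python
def quote_identifier(name):
    """Wrap a MySQL identifier in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def build_insert(table, columns):
    """Build a parameterized INSERT with every identifier backtick-quoted.

    The values stay as %s placeholders so the driver still escapes them.
    """
    cols = ", ".join(quote_identifier(c) for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return "INSERT INTO {} ({}) VALUES ({})".format(
        quote_identifier(table), cols, placeholders
    )


# Table and column names taken from the question.
sql = build_insert("jianshureping", ["like", "reward"])
print(sql)
```

The resulting string can be passed to cursor.execute(sql, (str(like), str(reward))) exactly as before. Renaming the column to something like like_count would avoid the quoting altogether.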
Could you explain exactly how to print the SQL statement?
@从MH到其他:
Just print the statement that you pass to cursor.execute.
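One way to do that without touching the database is sketched below. The function render_sql_for_debug is a hypothetical helper written for this example; its escaping is deliberately simplified, so the output is meant for reading or pasting into a database client, not for execution from code.

```python
def render_sql_for_debug(query, params):
    """Approximate the final statement text by substituting quoted parameters.

    For debugging only: real escaping should be left to the driver.
    """
    def literal(value):
        if value is None:
            return "NULL"
        if isinstance(value, (int, float)):
            return str(value)
        # Simplified escaping of backslashes and single quotes.
        text = str(value).replace("\\", "\\\\").replace("'", "\\'")
        return "'" + text + "'"

    return query % tuple(literal(p) for p in params)


# Statement and first row of values taken from the question.
query = "insert into jianshureping (like,reward) values (%s,%s)"
params = ("462", "7")

rendered = render_sql_for_debug(query, params)
print(rendered)

# With a live pymysql cursor, cursor.mogrify(query, params) should return
# the string that execute() would send, which makes this helper unnecessary.
```

Pasting the printed statement into a MySQL client reproduces the error outside Python, which makes it easier to see which part of the statement the server rejects.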
@自说自话唉: I still don't quite follow. I did try importing the data into Mongo instead, though, and that worked without any problem.