A problem encountered when scraping Taobao product information

[Unresolved question]

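The book scrapes the content shown in the browser's Elements panel directly with XPath. In practice, though, I found that neither XPath nor re can be applied directly to what Elements shows: the code below does not scrape anything.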

from lxml import etree
import requests
import re

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 ' \
                               '(KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36',
}


url = 'https://s.taobao.com/search?q=%E7%BE%8E%E9%A3%9F&s=88'
res = requests.get(url, headers=headers)
selector = etree.HTML(res.text)
infos = selector.xpath('//div[@class="item J_MouserOnverReq"]')
print(infos)
for info in infos:
    data = info.xpath('div[2]/div[1]/div[1]/strong/text()')[0]
    sell = info.xpath('div[2]/div[1]/div[2][@class="deal-cnt"]/text()')[0]
    address = info.xpath('div[2]/div[3]/div[2]/text()')[0]

However, running a re match against the page source does work.

sell = re.findall(r'\"view_price\"\:\"([\d.]*)\"',res.text,re.S)
print(sell)

Why does this happen?
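One possible explanation, offered as an assumption rather than a confirmed answer: the Elements panel shows the DOM after JavaScript has run, while requests only receives the raw source, where the product data may sit inside a script block as JSON rather than as div elements. The sketch below illustrates that situation with a made-up sample string standing in for res.text. The variable name g_page_config, the nesting mods/itemlist/data/auctions, and every field other than view_price are assumptions for illustration, not something the post confirms.

```python
import json
import re

# Hypothetical stand-in for res.text: the data is JSON inside a <script>
# block, and no rendered <div class="item ..."> elements exist yet.
SAMPLE_SOURCE = '''
<html><body>
<div id="mainsrp-itemlist"></div>
<script>
g_page_config = {"mods":{"itemlist":{"data":{"auctions":[
{"raw_title":"snack A","view_price":"19.90","view_sales":"120 paid","item_loc":"Shanghai"},
{"raw_title":"snack B","view_price":"8.50","view_sales":"45 paid","item_loc":"Hangzhou"}
]}}}};
</script>
</body></html>
'''


def extract_items(source):
    """Return the item dicts embedded in the page source, or [] if absent."""
    # Locate the (assumed) assignment, then let the JSON decoder read exactly
    # one object starting at that position.
    marker = re.search(r'g_page_config\s*=\s*', source)
    if marker is None:
        return []
    try:
        config, _ = json.JSONDecoder().raw_decode(source, marker.end())
    except json.JSONDecodeError:
        return []
    return (config.get("mods", {})
                  .get("itemlist", {})
                  .get("data", {})
                  .get("auctions", []))


if __name__ == "__main__":
    # The class that the XPath expression looks for is not in the raw source.
    print('class="item J_MouserOnverReq' in SAMPLE_SOURCE)
    # The regex from the post still matches, because the prices are in the JSON.
    print(re.findall(r'"view_price":"([\d.]*)"', SAMPLE_SOURCE))
    for item in extract_items(SAMPLE_SOURCE):
        print(item["raw_title"], item["view_price"], item["item_loc"])
```

If the real page follows a similar layout, parsing the embedded JSON once would give all fields per item together, instead of one regex per field. Whether it does can be checked by searching res.text for the class name used in the XPath expression.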

从MH到其他 | Beginner, Level 1 | Points (园豆): 140
Asked: 2018-07-23 14:26
