
Lines 10, 11, 12, 13, 19, 22, 24, 25 and 27 are reported as undefined variables. How do I fix this?

Bounty: 20 points (园豆) [Question closed] Closed on 2016-03-29 09:24
 1 # coding:utf8
 2 import re
 3 from urllib.parse import urljoin  # Python 3; on Python 2: from urlparse import urljoin
 4 from bs4 import BeautifulSoup
 5 class HtmlParser(object):
 6
 7     def _get_new_urls(self, page_url, soup):
 8         new_urls = set()  # absolute URLs found on this page, without duplicates
 9         for link in soup.find_all('a', href=re.compile(r"/view/\d+\.htm")):  # entry links only
10             new_url = link['href']
11             new_full_url = urljoin(page_url, new_url)  # resolve relative href against the page URL
12             new_urls.add(new_full_url)
13         return new_urls
14
15
16     def _get_new_data(self, page_url, soup):
17         res_data = {}
18
19         res_data['url'] = page_url
20
21         title_node = soup.find('dd', class_="lemmaWgt-lemmaTitle-title")
22         res_data['title'] = title_node.get_text()
23
24         summary_node = soup.find('div', class_="lemmaSummary")
25         res_data['summary'] = summary_node.get_text()
26
27         return res_data
28
29     def parse(self, page_url, html_cont):
30         if page_url is None or html_cont is None:
31             return
32
33         soup = BeautifulSoup(html_cont, 'html.parser', from_encoding='utf-8')
34         new_urls = self._get_new_urls(page_url, soup)
35         new_data = self._get_new_data(page_url, soup)
36         return new_urls, new_data
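
The link-collection step (lines 8 to 13) can be checked on its own, without BeautifulSoup. The following is only a minimal sketch, assuming Python 3 and using nothing but the standard library: `re` and `urljoin` are imported explicitly, and the results go into a set. The names `collect_view_urls` and `VIEW_LINK`, and the example.com URLs, are hypothetical and chosen for illustration; they are not part of the original code. The hrefs are passed in as plain strings instead of being read from a parsed page.

```python
import re
from urllib.parse import urljoin

# Pattern taken from the question: entry links of the form /view/<digits>.htm
VIEW_LINK = re.compile(r"/view/\d+\.htm")


def collect_view_urls(page_url, hrefs):
    """Return the set of absolute URLs for the hrefs that match VIEW_LINK."""
    new_urls = set()
    for href in hrefs:
        # search() finds the pattern anywhere in the href string
        if VIEW_LINK.search(href):
            # Resolve a relative href against the URL of the page it came from
            new_urls.add(urljoin(page_url, href))
    return new_urls


if __name__ == "__main__":
    # Hypothetical page URL and hrefs, used only for illustration
    page = "http://example.com/view/1.htm"
    hrefs = ["/view/123.htm", "/about", "/view/456.htm", "/view/123.htm"]
    for url in sorted(collect_view_urls(page, hrefs)):
        print(url)
```

Passing the same href twice, as in the sample list, shows why a set is convenient here: repeated links collapse into a single entry.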

KirkZheng | Beginner, level 1 | Points (园豆): 116
Asked: 2016-03-24 21:37
All answers (1)

Could you format the code a bit before posting it? Pasted like this, readers have to sort out your indentation line by line.

Rich.T | Points (园豆): 3440 (Veteran, level 4) | 2016-03-25 14:16