Python 3 Web Scraping with urllib + BeautifulSoup
2018-02-05 22:49
urllib
In Python 2, there were two libraries for sending requests: urllib and urllib2. In Python 3, urllib2 no longer exists; everything has been unified under urllib. The Python 3 urllib package contains four modules:

- urllib.request for opening and reading URLs
- urllib.error containing the exceptions raised by urllib.request
- urllib.parse for parsing URLs
- urllib.robotparser for parsing robots.txt files
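As a quick illustration of urllib.parse (the module names below come from the standard library; the example URLs are the index page from this article plus a made-up chapter filename):

```python
from urllib.parse import urlparse, urljoin, urlencode

# Split a URL into its components
parts = urlparse("http://www.biqukan.com/1_1094/")
print(parts.scheme)  # http
print(parts.netloc)  # www.biqukan.com
print(parts.path)    # /1_1094/

# Resolve a relative chapter link (filename is illustrative)
# against the index page, as you would when following scraped hrefs
full_url = urljoin("http://www.biqukan.com/1_1094/", "5403177.html")
print(full_url)  # http://www.biqukan.com/1_1094/5403177.html

# Build a query string from a dict
query = urlencode({"q": "python", "page": 2})
print(query)  # q=python&page=2
```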
```python
import urllib.request
from bs4 import BeautifulSoup

# Fetch the novel's index page (this site serves GBK-encoded HTML)
response = urllib.request.urlopen("http://www.biqukan.com/1_1094/")
html = response.read().decode("gbk")

# Parse the page and locate the chapter-list container
div_bf = BeautifulSoup(html, "html.parser")
div = div_bf.find_all('div', class_='listmain')

# Extract every chapter link inside the container
a_bf = BeautifulSoup(str(div[0]), "html.parser")
a = a_bf.find_all('a')
for each in a:
    print(each.string, each.get('href'))
```
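The example above assumes the request always succeeds. In practice, urllib.error lets you handle failed requests gracefully; below is a minimal sketch with a hypothetical `fetch` helper (the function name, the 10-second timeout, and the default GBK encoding are assumptions, not from the original article):

```python
import urllib.request
import urllib.error

def fetch(url, encoding="gbk"):
    """Fetch a page, returning None on HTTP or network errors.

    The default GBK encoding matches the site scraped above;
    pass a different encoding for other sites.
    """
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.read().decode(encoding)
    except urllib.error.HTTPError as e:
        # Server responded, but with an error status (404, 500, ...)
        print("HTTP error:", e.code)
    except urllib.error.URLError as e:
        # No response at all: DNS failure, refused connection, timeout, ...
        print("Network error:", e.reason)
    return None
```

The `with` block ensures the response object is closed even if decoding fails; catching HTTPError before URLError matters because HTTPError is a subclass of URLError.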