
Python third-party libraries: BeautifulSoup

2017-03-28 22:22

beautifulsoup4, a third-party library for parsing HTML documents

What is BeautifulSoup?

Learning resource: https://www.crummy.com/software/BeautifulSoup/bs4/doc/index.zh.html

1. Installation

pip install beautifulsoup4
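
A quick way to confirm the install worked is to import the package and print its version. This is only a sanity check, and it assumes the package was installed into the same interpreter you are running.

```python
import bs4

# bs4 exposes its version string as a module attribute
version = bs4.__version__
print(version)
```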

2. Usage

You need at least a basic understanding of HTML.

from bs4 import BeautifulSoup
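
Before fetching a real page, it can help to try the parser on a small string. The sketch below is a minimal illustration: the markup is made up for this example, and "html.parser" is Python's built-in parser, so nothing beyond beautifulsoup4 is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration
html = """
<html><head><title>Sample page</title></head>
<body>
<p class="intro">Hello, <b>soup</b>!</p>
<a href="/about">About</a>
<a href="http://example.com/">Example</a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Text inside the <title> tag
title = soup.title.string
# Text of the first <p class="intro">, with nested tags flattened
intro = soup.find("p", class_="intro").get_text()
# Value of the href attribute on every <a> tag
hrefs = [a.get("href") for a in soup.find_all("a")]

print(title)
print(intro)
print(hrefs)
```

Here find returns the first match, find_all returns every match, and get reads an attribute from a tag, returning None if the attribute is missing.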

Example: get all the links on a page

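from urllib.request import urlopen  # Python 3 replacement for urllib2
from urllib.parse import urljoin

def get_link(url="http://www.zhihu.com"):
    """Return every link on the page at url as an absolute URL."""
    hrefs = []
    html = urlopen(url).read()
    soup = BeautifulSoup(html, "html.parser")
    for link in soup.find_all('a'):
        href = link.get('href')
        if href is None:  # <a> tag without an href attribute
            continue
        # Resolve relative links against the page URL
        hrefs.append(urljoin(url, href))
    return hrefs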
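
The example above needs network access, which makes it awkward to experiment with. One possible variation, sketched here under my own assumptions, separates the parsing from the download so it can be run on any HTML string. The helper name extract_links and the sample markup are hypothetical and not part of the original example.

```python
from urllib.parse import urljoin
from bs4 import BeautifulSoup

def extract_links(html, base_url):
    """Return absolute URLs for every <a> tag in html that has an href (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # href=True skips <a> tags that have no href attribute
    for a in soup.find_all("a", href=True):
        # urljoin resolves relative paths and leaves absolute URLs unchanged
        links.append(urljoin(base_url, a["href"]))
    return links

# Made-up markup: one relative link, one anchor without href, one absolute link
sample = (
    '<a href="/explore">Explore</a>'
    '<a name="top">Top</a>'
    '<a href="http://example.com/x">X</a>'
)
result = extract_links(sample, "http://www.zhihu.com")
print(result)
```

Filtering with href=True matters because an anchor such as <a name="top"> has no href, and calling a string method on the None that get('href') returns would raise an AttributeError.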
