
Scraping Baidu search result titles and abstracts with Python

2017-04-11 16:28
Construct the search URL yourself to suit your query.
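
One way to build the URL, rather than pasting a long one from the browser, is to URL-encode only the query parameters that matter. This is a minimal sketch for Python 3 (in Python 2, `urlencode` lives in the `urllib` module instead). `build_baidu_url` is a hypothetical helper name. The sketch assumes that `wd` (the search term) and `ie` (the input encoding), both of which appear in the URL used in this post, are enough, and that the remaining `rsv_*` parameters can be dropped.

```python
from urllib.parse import urlencode

# Endpoint taken from the URL used in this post.
BASE_URL = 'http://www.baidu.com/s'


def build_baidu_url(keyword, encoding='utf-8'):
    """Return a search URL for `keyword` (hypothetical helper, not from the post)."""
    # Assumption: only `ie` and `wd` are required; other parameters are omitted.
    # A list of pairs keeps the parameter order fixed.
    params = [('ie', encoding), ('wd', keyword)]
    # urlencode percent-escapes the values, so spaces and non-ASCII text are safe.
    return BASE_URL + '?' + urlencode(params)


print(build_baidu_url('johnkey'))
# prints: http://www.baidu.com/s?ie=utf-8&wd=johnkey
```

The result can be assigned to `url` in place of a hand-copied address.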

# coding:utf-8
# Python 2 script: fetch a Baidu results page and print each result block.
import re
import urllib2

from bs4 import BeautifulSoup

url = 'http://www.baidu.com/s?ie=utf-8&f=8&rsv_bp=1&rsv_idx=1&tn=baidu&wd=johnkey&oq=john&rsv_pq=88bbfd770000beed&rsv_t=be24xj7KYq9tbjeRa7Fu10sW1dFF0GNZI1%2FW31Bq8OsZWZIwSpuRZxdcfQo&rqlang=cn&rsv_enter=1&inputT=787&rsv_sug3=12&rsv_sug1=7&rsv_sug7=100&rsv_sug2=0&rsv_sug4=787'

# Send a browser-like User-Agent header with the request.
request = urllib2.Request(url)
request.add_header('User-Agent', 'Mozilla/5.0')
response = urllib2.urlopen(request)
html = response.read()

soup = BeautifulSoup(html, 'html.parser', from_encoding='utf-8')

# Select every <div> whose id contains at least one digit.
links = soup.find_all('div', id=re.compile(r'\d+'))

for link in links:
    # Print the tag name, its id, and the text inside the block.
    print link.name, link['id'], link.get_text()