Python: a simple image crawler for Jandan (jandan.net)
2017-04-11 17:26
import os
import urllib.request

# Browser-like User-Agent header sent with every request
USER_AGENT = ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36')


def open_url(url):
    # Fetch a URL and return the raw response bytes
    req = urllib.request.Request(url)
    req.add_header('User-Agent', USER_AGENT)
    with urllib.request.urlopen(req) as response:
        return response.read()


def get_url(url):
    # Return the newest page number as a string: it starts 23 characters
    # after the 'current-comment-page' marker and ends at the next ']'
    html = open_url(url).decode('utf-8', 'ignore')
    a = html.find('current-comment-page') + 23
    b = html.find(']', a)
    return html[a:b]


def find_url(page_url):
    # Collect every .jpg address that follows an 'img src=' on the page
    html = open_url(page_url).decode('utf-8', 'ignore')
    url_list = []
    a = html.find('img src=')
    while a != -1:
        # Only look for '.jpg' within 255 characters of the tag
        b = html.find('.jpg', a, a + 255)
        if b != -1:
            url_list.append(html[a + 9:b + 4])
        else:
            b = a + 9
        a = html.find('img src=', b)
    return url_list


def save_pic(folder, url_list):
    # Save each image into the current directory
    # (download() has already changed into folder)
    for each in url_list:
        filename = each.split('/')[-1]
        # The scraped links are expected to start with //, so prepend the scheme
        data = open_url('http:' + each)
        with open(filename, 'wb') as f:
            f.write(data)


def download(file='demo', page=10):
    os.makedirs(file, exist_ok=True)
    os.chdir(file)
    url = 'http://jandan.net/ooxx/'
    page_num = int(get_url(url))
    for i in range(page):
        # Step back one page at a time from the newest page
        # (the original 'page_num -= i' skipped pages: 0, 1, 3, 6, ...)
        page_url = url + 'page-' + str(page_num - i) + '#comments'
        url_list = find_url(page_url)
        save_pic(file, url_list)


if __name__ == '__main__':
    download()
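
The find_url function above finds image links by searching the page text for 'img src=' and '.jpg'. As a rough sketch of the same step, the standard library's html.parser can read the src attribute directly, which also covers tags where src is not the first attribute. This assumes the images still sit in plain <img> tags. ImgSrcParser and extract_jpg_urls are hypothetical names that do not appear in the original post, and the sample HTML is made up for illustration.

```python
from html.parser import HTMLParser


class ImgSrcParser(HTMLParser):
    """Collect the src of every <img> tag whose address ends in .jpg."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag != 'img':
            return
        for name, value in attrs:
            if name == 'src' and value and value.lower().endswith('.jpg'):
                self.urls.append(value)


def extract_jpg_urls(html):
    # Hypothetical stand-in for the string search inside find_url
    parser = ImgSrcParser()
    parser.feed(html)
    return parser.urls


if __name__ == '__main__':
    # Made-up sample markup, not taken from jandan.net
    sample = ('<p><img src="//example.com/a/1.jpg" />'
              '<img src="//example.com/b.gif">'
              '<img alt="x" src="//example.com/c/2.jpg"></p>')
    print(extract_jpg_urls(sample))
```

The returned list has the same shape as the one find_url builds, so it could be passed to save_pic unchanged.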