python gzip get url
2010-08-23 19:34
<pre>
import urllib2, gzip, StringIO

__author__ = "Mark Pilgrim (mark@diveintomark.org)"
__license__ = "Python"

def get(uri):
    # ask the server for a gzip-compressed response
    request = urllib2.Request(uri)
    request.add_header("Accept-encoding", "gzip")
    usock = urllib2.urlopen(request)
    data = usock.read()
    # decompress only if the server actually sent gzip
    if usock.headers.get('content-encoding', None) == 'gzip':
        data = gzip.GzipFile(fileobj=StringIO.StringIO(data)).read()
    return data

if __name__ == '__main__':
    import sys
    # use the first command-line argument as the URL, else a default
    uri = sys.argv[1:] and sys.argv[1] or 'http://leknor.com/'
    print get(uri)
</pre>
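The script above is Python 2 (urllib2, StringIO, print statement). On Python 3 the same idea can be sketched with urllib.request and gzip.decompress. This is only a minimal sketch, not part of the original script; the helper name decode_body is made up here for illustration.

```python
import gzip
import urllib.request


def decode_body(data, content_encoding):
    """Return the body bytes, gunzipping them when the header says gzip."""
    # content_encoding may be None when the server sends no such header
    if (content_encoding or "").strip().lower() == "gzip":
        return gzip.decompress(data)
    return data


def get(uri):
    """Fetch uri while advertising gzip support; return the decoded bytes."""
    request = urllib.request.Request(uri, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request) as response:
        data = response.read()
        return decode_body(data, response.headers.get("Content-Encoding"))
```

get(uri) returns bytes, so decode the result with the page's charset before treating it as text.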
<div> <div><h3 class="title">Example 11.12. Using the redirect handler to detect permanent redirects</h3> http://diveintopython.org/http_web_services/redirects.html <h2 class="title"><a name="oa.gzip"></a>11.8. Handling compressed data</h2></div></div>
http://diveintopython.org/http_web_services/gzip_compression.html<br />
<br />
http://rationalpie.wordpress.com/2010/06/02/python-streaming-gzip-decompression/<br />
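The link above covers streaming gzip decompression. Its code is not reproduced here; the following is a rough sketch of one common way to do it on Python 3, using zlib.decompressobj so the whole response never has to sit in memory. The function name iter_gunzip is an assumption made for this example, and the sketch only handles a single gzip member.

```python
import zlib


def iter_gunzip(chunks):
    """Yield decompressed bytes for an iterable of gzip-compressed chunks."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        out = decompressor.decompress(chunk)
        if out:
            yield out
    # emit whatever is still buffered once the input is exhausted
    tail = decompressor.flush()
    if tail:
        yield tail
```

With an open response object, the chunks could come from something like iter(lambda: response.read(8192), b"").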
<br />
<div class="primary"> <h1>Python mechanize gzip response handling</h1> <p>Mechanize is awesome. The documentation is shit. The gzip support is non-existent. Some sites like Yahoo! require gzip support.</p> <pre>
import gzip
from mechanize import Browser

def ungzipResponse(r, b):
    headers = r.info()
    # .get() avoids a KeyError when the server sends no Content-Encoding header
    if headers.get('Content-Encoding') == 'gzip':
        gz = gzip.GzipFile(fileobj=r, mode='rb')
        html = gz.read()
        gz.close()
        # the original post forces this header; adjust it if the site is not UTF-8 HTML
        headers["Content-type"] = "text/html; charset=utf-8"
        r.set_data(html)
        b.set_response(r)

b = Browser()
b.addheaders.append(['Accept-Encoding', 'gzip'])
r = b.open('http://some-gzipped-site.com')
ungzipResponse(r, b)
print r.read()
</pre>
http://unformatt.com/news/python-mechanize-gzip-response-handling/<br />
</div>
http://news.ycombinator.com/item?id=1424488<br />
Good article: http://betterexplained.com/articles/how-to-optimize-your-site-with-gzip-compression/<br />