
Tornado 1.0 Source Code Analysis: Web Framework

2015-03-07 20:37

Summary: Tornado 1.0 source code analysis series, Web Framework

#Web Framework

Author: MetalBug
Date: 2015-03-02
Source: http://my.oschina.net/u/247728/blog
Notice: All rights reserved; infringement will be pursued.

`tornado.web`: `RequestHandler` and `Application` classes

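A Tornado web application maps URLs or URL patterns to subclasses of `RequestHandler`. Each subclass defines methods such as `get()` or `post()` to handle the corresponding HTTP request methods.

Here is an example: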

    from tornado import httpserver, ioloop, web


    class MainHandler(web.RequestHandler):
        def get(self):
            self.write("You requested the main page")


    # Map the root URL to MainHandler and serve it on port 8080.
    application = web.Application([(r"/", MainHandler)])
    http_server = httpserver.HTTPServer(application)
    http_server.listen(8080)
    ioloop.IOLoop.instance().start()

`MainHandler` inherits from `RequestHandler` and overrides `get()`. The `Application` maps it to the URL `/`, so a GET request to `/` on the host (port 8080 in this example) returns the string "You requested the main page".

##1.Application##

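`Application` holds the URLs and their corresponding handlers (subclasses of `RequestHandler`). It defines `__call__`, so an instance can be passed to `HTTPServer` as the `request_callback`; when a client requests a URL, the matching handler is invoked.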

###Internals - Data Structures###

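`self.transforms` holds the transforms used to chunk and compress the output.

`self.handlers` is the host-based routing list; each element is a `(host, URLSpec objects)` tuple.

`self.named_handlers` is a dict mapping a name to its handler, used by `reverse_url` for reverse lookup.

`self.settings` holds the settings, such as `static_path` and `static_url_prefix`.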
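
To make the role of `self.named_handlers` concrete, here is a minimal sketch of name-based reverse lookup. It is not Tornado's implementation: `NamedRoute` and the module-level `reverse_url` are hypothetical stand-ins, and the sketch assumes that capture groups are simple and not nested.

```python
import re


class NamedRoute:
    """Simplified stand-in for a named URL spec (illustrative only)."""

    def __init__(self, pattern, handler, name=None):
        self.regex = re.compile(pattern + "$")
        self.handler = handler
        self.name = name
        # Assumption: groups are simple and not nested, so each "(...)"
        # can be swapped for a "%s" placeholder to build a path template.
        self._template = re.sub(r"\([^)]*\)", "%s", pattern)

    def reverse(self, *args):
        # Fill the placeholders with the given arguments, in order.
        return self._template % args


# name -> route, filled in the same spirit as named_handlers.
named_handlers = {}
for route in [
    NamedRoute(r"/", "MainHandler", name="home"),
    NamedRoute(r"/article/([0-9]+)", "ArticleHandler", name="article"),
]:
    if route.name:
        named_handlers[route.name] = route


def reverse_url(name, *args):
    """Build a URL path from a route name and its group values."""
    return named_handlers[name].reverse(*args)


print(reverse_url("home"))         # /
print(reverse_url("article", 42))  # /article/42
```

Keeping the name-to-route mapping in a separate dict means the lookup does not have to scan the routing list.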

###Internals - Main Functions###

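`Application.__init__()`

Initializes the `Application`. It mainly does the following:
1. Initializes `self.transforms`. When no transforms are passed in, the default is `ChunkedTransferEncoding`, preceded by `GZipContentEncoding` if the `gzip` setting is enabled.
2. Initializes `self.handlers`: the static file routes are set up first, then the routing rules are added.
3. If debug mode is enabled (and the application is not running under WSGI), starts `autoreload`.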

    def __init__(self, handlers=None, default_host="", transforms=None,
                 wsgi=False, **settings):
        # Default transforms: optional gzip, then chunked transfer encoding.
        if transforms is None:
            self.transforms = []
            if settings.get("gzip"):
                self.transforms.append(GZipContentEncoding)
            self.transforms.append(ChunkedTransferEncoding)
        else:
            self.transforms = transforms
        ######
        # Static file routes are placed ahead of the user-supplied routes.
        if self.settings.get("static_path"):
            path = self.settings["static_path"]
            handlers = list(handlers or [])
            static_url_prefix = settings.get("static_url_prefix",
                                             "/static/")
            handlers = [
                (re.escape(static_url_prefix) + r"(.*)", StaticFileHandler,
                 dict(path=path)),
                (r"/(favicon\.ico)", StaticFileHandler, dict(path=path)),
                (r"/(robots\.txt)", StaticFileHandler, dict(path=path)),
            ] + handlers
        if handlers:
            self.add_handlers(".*$", handlers)
        ####
        # Reload modified modules automatically in debug mode.
        if self.settings.get("debug") and not wsgi:
            import autoreload
            autoreload.start()
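
The static-file routes built in the constructor can be tried in isolation. The following is a minimal sketch rather than Tornado's code: `build_static_patterns` and `match_static` are hypothetical helpers, and anchoring each pattern with `$` is an assumption made for this example. It shows why the prefix goes through `re.escape` and how the captured group becomes the file path.

```python
import re


def build_static_patterns(static_url_prefix="/static/"):
    """Return compiled patterns like the ones prepended for static files.

    Hypothetical helper for illustration; the "$" anchor is an assumption.
    """
    raw_patterns = [
        # re.escape keeps characters such as "." in the prefix literal.
        re.escape(static_url_prefix) + r"(.*)",
        r"/(favicon\.ico)",
        r"/(robots\.txt)",
    ]
    return [re.compile(p + "$") for p in raw_patterns]


def match_static(path, patterns):
    """Return the captured file path of the first matching pattern."""
    for regex in patterns:
        match = regex.match(path)
        if match:
            return match.group(1)
    return None


patterns = build_static_patterns("/static.v1/")
print(match_static("/static.v1/css/site.css", patterns))  # css/site.css
print(match_static("/staticXv1/css/site.css", patterns))  # None
print(match_static("/favicon.ico", patterns))             # favicon.ico
```

Without `re.escape`, the `.` in a prefix such as `/static.v1/` would match any character, so the second request would be served as a static file as well.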

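`Application.add_handlers()`

`Application.add_handlers()` adds routing rules to `self.handlers`.

`self.handlers` is the host-based routing list; each element is a tuple holding a host pattern and a list of routes (`URLSpec` objects).

`Application.add_handlers()` first combines `host_pattern` (the host name pattern) and `handlers` (the route list) into a tuple, then adds that tuple to `self.handlers`.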

    def add_handlers(self, host_pattern, host_handlers):
        ####
        # Keep the wildcard host group last: insert new groups just before it.
        if self.handlers and self.handlers[-1][0].pattern == '.*$':
            self.handlers.insert(-1, (re.compile(host_pattern), handlers))
        else:
            self.handlers.append((re.compile(host_pattern), handlers))

        for spec in host_handlers:
            if spec.name:
                ####
                # Register named routes for reverse_url() lookups.
                self.named_handlers[spec.name] = spec
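
The ordering rule in `add_handlers()` is easier to see in a runnable form. This is a minimal sketch under stated assumptions: `HostRouter` and `routes_for` are hypothetical names, routes are plain strings instead of `URLSpec` objects, and host lookup is assumed to return the first group whose pattern matches.

```python
import re


class HostRouter:
    """Toy stand-in for the host-level part of Application (illustrative)."""

    def __init__(self):
        self.handlers = []  # list of (compiled host regex, route list)

    def add_handlers(self, host_pattern, routes):
        if not host_pattern.endswith("$"):
            host_pattern += "$"
        entry = (re.compile(host_pattern), list(routes))
        # Keep the catch-all group last so specific hosts are tried first.
        if self.handlers and self.handlers[-1][0].pattern == ".*$":
            self.handlers.insert(-1, entry)
        else:
            self.handlers.append(entry)

    def routes_for(self, host):
        # Assumption: the first matching host pattern wins.
        for regex, routes in self.handlers:
            if regex.match(host):
                return routes
        return None


router = HostRouter()
router.add_handlers(".*$", ["default-route"])
router.add_handlers(r"api\.example\.com", ["api-route"])

print([regex.pattern for regex, _ in router.handlers])
print(router.routes_for("api.example.com"))  # ['api-route']
print(router.routes_for("www.example.com"))  # ['default-route']
```

Had the second group simply been appended, the `.*$` entry would match every host first and `api-route` would never be reached.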

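`Application.__call__()`

`Application` defines `__call__()` so that its instances are callable and can serve as the `request_callback` of `HTTPServer`.
The method runs as follows:
1. Instantiates every class in `self.transforms` with the `request`; the resulting transforms chunk and compress the data being sent.
2. Looks up the route list for the `request`'s host, then matches `request.path` against each entry in turn to find the handler, extracting the path arguments at the same time (`match.groups()` and `match.groupdict()`).
3. The matched handler is a `RequestHandler` object; its `_execute()` method is called, which invokes the function corresponding to the HTTP method.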

    def __call__(self, request):
        # Instantiate one transform per request.
        transforms = [t(request) for t in self.transforms]
        ####
        handlers = self._get_host_handlers(request)
        ####
        for spec in handlers:
            match = spec.regex.match(request.path)
            if match:
                handler = spec.handler_class(self, request, **spec.kwargs)
                # Captured groups become the handler method's arguments.
                kwargs = dict((k, unquote(v))
                              for (k, v) in match.groupdict().iteritems())
                args = [unquote(s) for s in match.groups()]
                break
        if not handler:
            handler = ErrorHandler(self, request, 404)
        ####
        handler._execute(transforms, *args, **kwargs)
        return handler
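
The path-matching step can be sketched as a small standalone dispatcher. Everything here is hypothetical (`EchoHandler`, `NotFoundHandler`, `ROUTES`, `dispatch`), and it is written for Python 3, so it uses `urllib.parse.unquote` and `dict.items()`. One deliberate difference from the excerpt: the sketch passes either named or positional groups, never both, so that a named group is not handed to the handler twice.

```python
import re
from urllib.parse import unquote


class EchoHandler:
    """Hypothetical handler that reports how it was called."""

    def execute(self, *args, **kwargs):
        return ("ok", args, kwargs)


class NotFoundHandler:
    """Hypothetical fallback used when no route matches."""

    def execute(self, *args, **kwargs):
        return ("404", args, kwargs)


ROUTES = [
    (re.compile(r"/article/([^/]+)$"), EchoHandler),
    (re.compile(r"/user/(?P<name>[^/]+)$"), EchoHandler),
]


def dispatch(path):
    handler, args, kwargs = None, [], {}
    for regex, handler_class in ROUTES:
        match = regex.match(path)
        if match:
            handler = handler_class()
            # Named groups become keyword arguments, others positional.
            kwargs = {k: unquote(v) for k, v in match.groupdict().items()}
            args = [] if kwargs else [unquote(s) for s in match.groups()]
            break
    if handler is None:
        handler = NotFoundHandler()
    return handler.execute(*args, **kwargs)


print(dispatch("/article/hello%20world"))  # ('ok', ('hello world',), {})
print(dispatch("/user/metal%2Fbug"))       # ('ok', (), {'name': 'metal/bug'})
print(dispatch("/missing"))                # ('404', (), {})
```

Initializing `handler`, `args`, and `kwargs` before the loop is what lets the 404 fallback work when nothing matches.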

###Internals - Implementation Details###

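When `Application` is initialized, it calls `add_handlers(".*$", handlers)`. Here `.*` serves as the default host pattern: because `.*` matches any string, the route list passed to the constructor becomes the default route list.

Because `.*` matches any string, `Application.add_handlers()` has to keep that entry at the end of the list; otherwise it would shadow every more specific host pattern.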

Why does `Application` define `__call__()`?

Below is a description of `__call__()`. It is similar to a C++ `functor` and is mainly useful when internal state needs to be kept.

`__call__(self, [args...])`

Allows an instance of a class to be called as a function. Essentially, this means that `x()` is the same as `x.__call__()`. Note that `__call__` takes a variable number of arguments; this means that you define `__call__` as you would any other function, taking however many arguments you'd like it to.

`__call__` can be particularly useful in classes whose instances need to change state often.

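For the current `Application`, however, `__call__` serves no special purpose here; passing an ordinary bound method such as `self.callback` would work just as well.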
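
The equivalence claimed above can be checked with a small sketch. All names are hypothetical (`FakeServer`, `CallableApp`, `MethodApp`); the only assumption is that the server stores whatever callable it is given and invokes it with the request.

```python
class FakeServer:
    """Hypothetical server that stores a request callback and calls it."""

    def __init__(self, request_callback):
        self.request_callback = request_callback

    def handle(self, request):
        return self.request_callback(request)


class CallableApp:
    """State lives on the instance; the instance itself is callable."""

    def __init__(self):
        self.seen = []

    def __call__(self, request):
        self.seen.append(request)
        return "handled " + request


class MethodApp:
    """Same state, exposed through an ordinary method instead."""

    def __init__(self):
        self.seen = []

    def callback(self, request):
        self.seen.append(request)
        return "handled " + request


callable_app = CallableApp()
method_app = MethodApp()

# The instance itself is the callback...
server_a = FakeServer(callable_app)
# ...or a bound method carries the same state along.
server_b = FakeServer(method_app.callback)

print(server_a.handle("/a"), callable_app.seen)  # handled /a ['/a']
print(server_b.handle("/b"), method_app.seen)    # handled /b ['/b']
```

Both variants keep per-application state, so the choice is mostly about how the object reads at the call site.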

##2.RequestHandler##

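In `Application.__call__()`, the `RequestHandler` exposes `_execute()` to the `Application`; this method implements the actual dispatching and handling of the HTTP request.
In practice, we subclass `RequestHandler` and override `get()`, `post()`, and so on to handle HTTP requests.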

###Internals - Data Structures###

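`self.request` is the request (`HTTPRequest`) that the `RequestHandler` has to handle.

`self._auto_finish` supports asynchronous handling: it controls whether `finish()` is called automatically once the handler method returns.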

###Internals - Main Functions###
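`RequestHandler._execute()`
`RequestHandler._execute()` calls the function that corresponds to the HTTP request method; unsupported methods are rejected with a 405 error.
The main flow is:
1. If the request is a POST and the `xsrf_cookies` check is enabled, verify the XSRF cookie first.
2. Call `self.prepare()`, which subclasses override to do any preparation before the request is handled.
3. Call the handler function for the HTTP request method.
4. If `self._auto_finish` is `True`, call `self.finish()` to end the request.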

    def _execute(self, transforms, *args, **kwargs):
        self._transforms = transforms
        try:
            if self.request.method not in self.SUPPORTED_METHODS:
                raise HTTPError(405)
            if self.request.method == "POST" and \
               self.application.settings.get("xsrf_cookies"):
                self.check_xsrf_cookie()
            self.prepare()
            if not self._finished:
                # Dispatch to get(), post(), ... by HTTP method name.
                getattr(self, self.request.method.lower())(*args, **kwargs)
                if self._auto_finish and not self._finished:
                    self.finish()
        except Exception, e:  # Python 2 syntax, as in the Tornado 1.0 source
            self._handle_request_exception(e)
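
The control flow of `_execute()` can be reproduced in a few lines of Python 3. This is a sketch, not the Tornado source: `MiniHandler`, its `execute()` method, and the `run` helper are hypothetical, the XSRF check is left out, and errors are recorded in a list instead of being turned into an HTTP response.

```python
class HTTPError(Exception):
    def __init__(self, status_code):
        super().__init__(status_code)
        self.status_code = status_code


class MiniHandler:
    """Toy handler mirroring the control flow of _execute (illustrative)."""

    SUPPORTED_METHODS = ("GET", "HEAD", "POST", "DELETE", "PUT")

    def __init__(self, method):
        self.method = method
        self.calls = []
        self._finished = False
        self._auto_finish = True

    def prepare(self):
        self.calls.append("prepare")

    def get(self, *args):
        self.calls.append(("get", args))

    def post(self, *args):
        # Methods that are not overridden answer with 405.
        raise HTTPError(405)

    def finish(self):
        self.calls.append("finish")
        self._finished = True

    def execute(self, *args, **kwargs):
        try:
            if self.method not in self.SUPPORTED_METHODS:
                raise HTTPError(405)
            self.prepare()
            if not self._finished:
                # "GET" -> self.get, "POST" -> self.post, and so on.
                getattr(self, self.method.lower())(*args, **kwargs)
                if self._auto_finish and not self._finished:
                    self.finish()
        except HTTPError as e:
            self.calls.append(("error", e.status_code))


def run(method, *args, auto_finish=True):
    handler = MiniHandler(method)
    handler._auto_finish = auto_finish
    handler.execute(*args)
    return handler.calls


print(run("GET", "42"))               # ['prepare', ('get', ('42',)), 'finish']
print(run("POST"))                    # ['prepare', ('error', 405)]
print(run("TRACE"))                   # [('error', 405)]
print(run("GET", auto_finish=False))  # ['prepare', ('get', ())]
```

The last call shows the asynchronous case: with `_auto_finish` turned off, the handler method returns without the request being finished, and some later callback is expected to call `finish()`.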

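`RequestHandler.finish()`

`RequestHandler.finish()` does the work that follows the business logic.
It mainly performs the following cleanup:
1. Sets the response headers.
2. Calls `self.flush()` to write the buffer out through the `IOStream`.
3. Finishes the request, which closes the connection unless it is kept alive.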

    def finish(self, chunk=None):
        if chunk is not None:
            self.write(chunk)
        if not self._headers_written:
            ####set_header
        if hasattr(self.request, "connection"):
            self.request.connection.stream.set_close_callback(None)
        if not self.application._wsgi:
            self.flush(include_footers=True)
            self.request.finish()
            self._log()
        self._finished = True

`RequestHandler.flush()`

`RequestHandler.flush()` first passes the buffered data through each `transform` for chunking and compression, then sends it to the client.

    def flush(self, include_footers=False):
        if self.application._wsgi:
            raise Exception("WSGI applications do not support flush()")
        chunk = "".join(self._write_buffer)
        self._write_buffer = []
        if not self._headers_written:
            self._headers_written = True
            for transform in self._transforms:
                self._headers, chunk = transform.transform_first_chunk(
                    self._headers, chunk, include_footers)
            headers = self._generate_headers()
        else:
            for transform in self._transforms:
                chunk = transform.transform_chunk(chunk, include_footers)
            headers = ""

        if self.request.method == "HEAD":
            if headers:
                self.request.write(headers)
            return

        if headers or chunk:
            self.request.write(headers + chunk)
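
To show how the first flush differs from later ones, here is a minimal Python 3 sketch of the transform chain with a chunked transfer-encoding transform. `ChunkedTransform` and `MiniResponse` are hypothetical simplifications: headers are kept as a dict, the body is bytes, and output is collected in a list named `wire` instead of being written to a connection.

```python
class ChunkedTransform:
    """Simplified stand-in for a chunked transfer-encoding transform."""

    def transform_first_chunk(self, headers, chunk, finishing):
        headers = dict(headers)
        headers["Transfer-Encoding"] = "chunked"
        return headers, self.transform_chunk(chunk, finishing)

    def transform_chunk(self, chunk, finishing):
        if chunk:
            # Frame the data as: hex length, CRLF, data, CRLF.
            chunk = b"%x\r\n%s\r\n" % (len(chunk), chunk)
        if finishing:
            # A zero-length chunk terminates the body.
            chunk += b"0\r\n\r\n"
        return chunk


class MiniResponse:
    """Toy response object with the same flush() structure (illustrative)."""

    def __init__(self, transforms):
        self._transforms = transforms
        self._headers = {"Content-Type": "text/plain"}
        self._headers_written = False
        self._write_buffer = []
        self.wire = []  # what would be written to the connection

    def write(self, data):
        self._write_buffer.append(data)

    def flush(self, include_footers=False):
        chunk = b"".join(self._write_buffer)
        self._write_buffer = []
        if not self._headers_written:
            # First flush: transforms may still change the headers.
            self._headers_written = True
            for t in self._transforms:
                self._headers, chunk = t.transform_first_chunk(
                    self._headers, chunk, include_footers)
            self.wire.append(("headers", dict(self._headers)))
        else:
            for t in self._transforms:
                chunk = t.transform_chunk(chunk, include_footers)
        if chunk:
            self.wire.append(("body", chunk))


response = MiniResponse([ChunkedTransform()])
response.write(b"hello")
response.flush()
response.write(b", world")
response.flush(include_footers=True)
for item in response.wire:
    print(item)
```

Only the first flush can touch the headers, which is why the transform interface has a separate `transform_first_chunk` entry point.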

###Internals - Implementation Details###

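In `RequestHandler.finish()`, the close callback of `self.request.connection.stream` (hereafter `close_callback`) is set to `None`.
Since the request has finished, clearing `close_callback` keeps the `RequestHandler` from being garbage-collected late.
If it were not cleared and the request used a keep-alive connection, then after the request ended the `RequestHandler` would not be collected, because `close_callback` would still be bound to the request's connection.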
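
The garbage-collection effect described here can be observed with `weakref`. This is a minimal sketch with hypothetical classes (`FakeStream`, `FakeHandler`) and a hypothetical `handler_survives` helper; it assumes the close callback is a bound method of the handler and that the stream outlives the request, as it would on a keep-alive connection.

```python
import gc
import weakref


class FakeStream:
    """Hypothetical long-lived stream, as on a keep-alive connection."""

    def __init__(self):
        self._close_callback = None

    def set_close_callback(self, callback):
        self._close_callback = callback


class FakeHandler:
    def __init__(self, stream):
        self.stream = stream
        # The bound method holds a strong reference to this handler.
        stream.set_close_callback(self.on_connection_close)

    def on_connection_close(self):
        pass

    def finish(self, clear_callback):
        if clear_callback:
            self.stream.set_close_callback(None)


def handler_survives(clear_callback):
    stream = FakeStream()  # stays alive for the whole function
    handler = FakeHandler(stream)
    ref = weakref.ref(handler)
    handler.finish(clear_callback)
    del handler
    gc.collect()
    # True means something still references the handler.
    return ref() is not None


print(handler_survives(clear_callback=False))  # True
print(handler_survives(clear_callback=True))   # False
```

As long as the stream keeps the bound method, the handler stays reachable through it; clearing the callback breaks that link.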

    def finish(self, chunk=None):
        ####
        if hasattr(self.request, "connection"):
            # Now that the request is finished, clear the callback we
            # set on the IOStream (which would otherwise prevent the
            # garbage collection of the RequestHandler when there
            # are keepalive connections)
            self.request.connection.stream.set_close_callback(None)
        if not self.application._wsgi:
            self.flush(include_footers=True)
            self.request.finish()
            self._log()
        self._finished = True

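In the code above, `close_callback` is set to `None` first and `request.finish()` is called afterwards. According to the earlier analysis of `HTTPRequest` and `IOStream`, `_close_callback` has already been set to `None` by the time `request.finish()` runs, so it is never invoked there. Why is that?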

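What matters here is that `RequestHandler.on_connection_close()` and the close callback of `IOStream` (`_close_callback`) do not mean the same thing.

In `RequestHandler`, `on_connection_close()` is for the case where the client is detected to have closed the connection; it is called in asynchronous handlers and is a place for error handling and similar cleanup.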

    def on_connection_close(self):
        """Called in async handlers if the client closed the connection.

        You may override this to clean up resources associated with
        long-lived connections.

        Note that the select()-based implementation of IOLoop does not detect
        closed connections and so this method will not be called until
        you try (and fail) to produce some output.  The epoll- and kqueue-
        based implementations should detect closed connections even while
        the request is idle.
        """
        pass

In `IOStream`, `self._close_callback` is invoked from `IOStream.close()`, that is, when `request.finish()` closes the stream.

    def set_close_callback(self, callback):
        """Call the given callback when the stream is closed."""
        self._close_callback = callback

#Summary

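From the analysis of `Application` and `RequestHandler`, we can see that the web framework in Tornado 1.0 handles a request as follows:

1. The web application creates and initializes a `RequestHandler` object for every request.
2. The web application calls `RequestHandler.prepare()`. `prepare()` is called no matter which HTTP method is used; subclasses override it.
3. The web application calls the handler function for the HTTP method, for example `get()`, `post()`, or `put()`. If the URL's regular expression contains capture groups, the matched values are passed to the method as arguments.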

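Of course, we can also see that the design of `RequestHandler` in Tornado 1.0 still has shortcomings. One is the mismatch in the meaning of `close_callback` discussed above. Another: since `prepare()` can be overridden for setup before handling, why does `finish()` not call an `on_finish()` hook for custom cleanup? These points leave room for improvement; see how later versions of Tornado deal with them.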
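
The `on_finish()` idea raised above could look roughly like the sketch below. It is purely illustrative: `HookedHandler`, `AuditedHandler`, and `run()` are hypothetical, and the code only shows where such a hook would sit relative to `prepare()` and `finish()`, not how any Tornado release implements it.

```python
class HookedHandler:
    """Toy handler with a hypothetical on_finish() hook (illustrative)."""

    def __init__(self):
        self.events = []
        self._finished = False

    def prepare(self):
        self.events.append("prepare")

    def get(self):
        self.events.append("get")

    def on_finish(self):
        """Override in subclasses for cleanup; does nothing by default."""

    def finish(self):
        self.events.append("finish")
        self._finished = True
        # Run user cleanup only after the response has been completed.
        self.on_finish()

    def run(self):
        self.prepare()
        self.get()
        if not self._finished:
            self.finish()


class AuditedHandler(HookedHandler):
    def on_finish(self):
        self.events.append("on_finish")


handler = AuditedHandler()
handler.run()
print(handler.events)  # ['prepare', 'get', 'finish', 'on_finish']
```

With this shape, `prepare()` and `on_finish()` bracket the request symmetrically, and subclasses never need to override `finish()` itself.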

PS: My own understanding of web development is fairly limited, so please point out anything I got wrong. Thanks.
