
Estimator: A High-Level API in TensorFlow

2018-03-29 14:35

Advantages of Estimators

Pre-made Estimators
Structure of a pre-made Estimator program

Benefits of pre-made Estimators

Custom Estimators

Recommended workflow

Original article:

https://www.tensorflow.org/programmers_guide/estimators

An Estimator encapsulates the following actions:

Training

Evaluation

Prediction

Export for serving

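Advantages of Estimators

Estimator-based models can run locally or in a distributed environment, and on CPUs, GPUs, or TPUs, without any code changes.

Estimators make it easier for model developers to share implementations.

Creating a model with Estimators is easier than with the low-level APIs.

Estimators are themselves built on tf.layers, which simplifies customization.

Estimators build the graph for you.

Estimators provide a safe distributed training loop that controls how and when to:

Build the graph

Initialize variables

Start queues

Handle exceptions

Create checkpoint files and recover from failures

Save summaries for TensorBoard

When writing an application with Estimators, you must separate the data input pipeline from the model. This separation simplifies experiments with different data sets.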

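Pre-made Estimators

Pre-made Estimators create and manage the Graph and Session objects for you, so you can try different model architectures with only minimal code changes. For example, DNNClassifier is a pre-made Estimator class that trains classification models using dense, feed-forward neural networks.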

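Structure of a pre-made Estimator program

1. Write one or more dataset importing functions, for example one for the training set and one for the test set. Each dataset importing function must return two objects:

A dictionary in which the keys are feature names and the values are Tensors (or SparseTensors) containing the corresponding feature data

A Tensor containing one or more labels

For example, the following code shows the basic skeleton of an input function: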

def input_fn(dataset):
    ...  # manipulate dataset, extracting feature names and the label
    return feature_dict, label
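As a rough illustration of the two return values described above, the sketch below builds them from a small in-memory table. It is deliberately framework-free so that it runs without TensorFlow: plain Python lists stand in for Tensors, and the RECORDS data, the make_input_fn helper, and the column names are all assumptions made for this example rather than part of the TensorFlow API.

```python
# Hypothetical records: each row holds two features and a label.
RECORDS = [
    {"population": 1200.0, "crime_rate": 0.03, "label": 0},
    {"population": 56000.0, "crime_rate": 0.11, "label": 1},
    {"population": 8300.0, "crime_rate": 0.05, "label": 0},
]


def make_input_fn(records, label_key="label"):
    """Return a zero-argument input function over `records`.

    The returned function produces (feature_dict, labels), mirroring the
    shape of what an Estimator input function returns. Lists are used
    here where a real input function would return Tensors.
    """
    def input_fn():
        feature_names = [k for k in records[0] if k != label_key]
        # One column (list of values) per feature name.
        feature_dict = {
            name: [row[name] for row in records] for name in feature_names
        }
        labels = [row[label_key] for row in records]
        return feature_dict, labels

    return input_fn


train_input_fn = make_input_fn(RECORDS)
features, labels = train_input_fn()
```

Wrapping the data in a closure keeps the returned function free of arguments, which is the form in which an input function is usually handed to an Estimator method. Swapping in a different data set then only means building another input function, with no change to the model.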

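2. Define the feature columns. Each tf.feature_column identifies a feature's name, its type, and any input pre-processing. The following snippet creates three feature columns that hold integer or floating-point data. The first two feature columns simply identify the feature's name and type. The third feature column also specifies a lambda that the program will call to scale the raw data: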

import tensorflow as tf

# Define three numeric feature columns.

population = tf.feature_column.numeric_column('population')
crime_rate = tf.feature_column.numeric_column('crime_rate')
# normalizer_fn must be a callable, not a string.
# global_education_mean is assumed to be defined elsewhere.
median_education = tf.feature_column.numeric_column(
    'median_education',
    normalizer_fn=lambda x: x - global_education_mean)
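To make the effect of the third column's normalizer concrete, here is a small framework-free sketch of the same centering step. The value of global_education_mean and the sample data are invented for illustration. In a real program the mean would come from the training data, and the function would receive a Tensor rather than individual list elements.

```python
# Assumed value for illustration only; normally derived from training data.
global_education_mean = 12.0


def center_education(x):
    """Same rule as the normalizer in the snippet above: subtract the mean."""
    return x - global_education_mean


# Hypothetical raw values of the 'median_education' feature.
raw_median_education = [10.0, 12.0, 16.5]

# Apply the normalizer element by element (a Tensor would broadcast instead).
centered = [center_education(v) for v in raw_median_education]
print(centered)  # [-2.0, 0.0, 4.5]
```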

3. Instantiate the relevant pre-made Estimator. For example:

# Instantiate an estimator, passing the feature columns.

estimator = tf.estimator.LinearClassifier(
    feature_columns=[population, crime_rate, median_education],
)

4. Call a training, evaluation, or prediction method. For example, all Estimators provide a train method:

# my_training_set is the function created in Step 1

estimator.train(input_fn=my_training_set, steps=2000)

Benefits of pre-made Estimators:

Best practices for determining where different parts of the computational graph should run, implementing strategies on a single machine or on a cluster.

Best practices for event (summary) writing and universally useful summaries.

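Custom Estimators

The heart of every Estimator, whether pre-made or custom, is its model function, a method that builds the graphs for training, evaluation, and prediction. When you use a pre-made Estimator, someone else has already implemented the model function. When you use a custom Estimator, you must write the model function yourself.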
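The idea that a single model function serves all three phases can be sketched without TensorFlow. The toy below is an analogy only: the mode strings, the toy_model_fn name, the one-weight linear model, and the returned dictionaries are assumptions for illustration. A real model function receives Tensors and returns a tf.estimator.EstimatorSpec.

```python
TRAIN, EVAL, PREDICT = "train", "eval", "predict"  # hypothetical mode names


def toy_model_fn(features, labels, mode, params):
    """Build the outputs for one phase of a one-weight linear model."""
    w = params["weight"]
    predictions = [w * x for x in features["x"]]

    # Prediction needs no labels, so it returns before any loss is computed.
    if mode == PREDICT:
        return {"predictions": predictions}

    # Training and evaluation share the same mean squared error loss.
    errors = [p - y for p, y in zip(predictions, labels)]
    loss = sum(e * e for e in errors) / len(errors)

    if mode == EVAL:
        return {"loss": loss}

    # Training additionally takes one gradient descent step on the weight.
    grad = sum(2 * e * x for e, x in zip(errors, features["x"])) / len(errors)
    new_weight = w - params["learning_rate"] * grad
    return {"loss": loss, "weight": new_weight}


features = {"x": [1.0, 2.0, 3.0]}
labels = [2.0, 4.0, 6.0]
params = {"weight": 1.0, "learning_rate": 0.1}

pred_out = toy_model_fn(features, None, PREDICT, params)
eval_out = toy_model_fn(features, labels, EVAL, params)
train_out = toy_model_fn(features, labels, TRAIN, params)
```

The point of the pattern is that the forward computation is written once and each mode only adds what it needs: nothing extra for prediction, a loss for evaluation, and a loss plus an update step for training.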