
How to Get Started with a "Deep Learning" Project

2017-02-16 14:22 | 441 views




Original article:

http://www.svds.com/getting-started-deep-learning/

A REVIEW OF AVAILABLE TOOLS | FEBRUARY 15TH, 2017

This article also came to me through a Hacker News recommendation. First, a word about my own background: I have never worked with deep learning before. At most I picked up a little image recognition and matching back in school, and in my spare time I have written small games while puzzling over how their AI should work and how it should learn after being beaten. So I do not really know what deep learning is either; I searched through a large number of deep learning frameworks and tools, and I am sharing what I found here.

The article explains how to carry out a deep learning project once you have one: how to choose a framework, how to choose a language, and how to handle training and management later on. It can be read as a template workflow that lets people without deep learning experience try to start a deep learning task.

Overview

At SVDS, our R&D team has been investigating different deep learning technologies, from recognizing images of trains to speech recognition. We needed to build a pipeline for ingesting data, creating a model, and evaluating the model performance. However, when we researched what technologies were available, we could not find a concise summary document to reference for starting a new deep learning project.

One way to give back to the open source community that provides us with tools is to help others evaluate and choose those tools in a way that takes advantage of our experience. We offer the chart below, along with explanations of the various criteria upon which we based our decisions.





These rankings are a combination of our subjective experiences with image and speech recognition applications for these technologies, as well as publicly available benchmarking studies. We explain our scoring below:

Languages

When getting started with deep learning, it is best to use a framework that supports a language you are familiar with. For instance, Caffe (C++) and Torch (Lua) have Python bindings for their codebases (with PyTorch being released in January 2017), but we would recommend that you be proficient with C++ or Lua respectively if you would like to use those technologies. In comparison, TensorFlow and MXNet have great multi-language support that makes it possible to utilize the technology even if you are not proficient with C++.
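As a taste of what that language support looks like in practice, here is a minimal sketch of defining a computation in TensorFlow entirely from Python. It assumes the TensorFlow 1.x graph-style API that was current when this article was written, and the shapes are purely illustrative.

    import numpy as np
    import tensorflow as tf

    # A single dense layer defined from Python (TensorFlow 1.x graph-style API).
    x = tf.placeholder(tf.float32, shape=[None, 784])            # batch of flattened 28x28 inputs
    W = tf.Variable(tf.truncated_normal([784, 10], stddev=0.1))  # weights
    b = tf.Variable(tf.zeros([10]))                               # biases
    logits = tf.matmul(x, W) + b                                  # class scores

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        scores = sess.run(logits, feed_dict={x: np.zeros((1, 784), dtype=np.float32)})
        print(scores.shape)                                       # (1, 10)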





Tutorials and Training Materials

Deep learning technologies vary dramatically in the quality and quantity of tutorials and getting-started materials. Theano, TensorFlow, Torch, and MXNet have well documented tutorials that are easy to understand and implement. While Microsoft's CNTK and Intel's Nervana Neon are powerful tools, we struggled to find beginner-level materials. Additionally, we've found that the engagement of the GitHub community is a strong indicator of not only a tool's future development, but also a measure of how likely and how fast an issue or bug can be solved through searching StackOverflow or the repo's Git Issues. It is important to note that TensorFlow is the 800-pound gorilla in the room in regards to quantity of tutorials, training materials, and community of developers and users.



CNN Modeling Capability

Convolutional neural networks (CNNs) are used for image recognition, recommendation engines, and natural language processing. A CNN is composed of a set of distinct layers that transform the initial data volume into output scores for predefined classes (for more information, check out Eugenio Culurciello's overview of neural network architectures). CNNs can also be used for regression analysis, such as models that output steering angles in autonomous vehicles. We consider a technology's CNN modeling capability to include several features: the opportunity space to define models, the availability of prebuilt layers, and the tools and functions available to connect these layers. We've seen that Theano, Caffe, and MXNet all have great CNN modeling capabilities. That said, TensorFlow's easy ability to build upon its InceptionV3 model and Torch's great CNN resources, including easy-to-use temporal convolution, set these two technologies apart for CNN modeling capability.
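To make "prebuilt layers and the tools to connect them" concrete, here is a minimal sketch of a small image classifier assembled from stock layers. It uses the Keras Sequential API (discussed later in this article) rather than any single framework's native layer API, and the input shape and class count are illustrative assumptions.

    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

    # Stack prebuilt layers into a small image classifier (shapes are illustrative).
    model = Sequential([
        Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)),  # 64x64 RGB input
        MaxPooling2D(pool_size=(2, 2)),
        Conv2D(64, (3, 3), activation='relu'),
        MaxPooling2D(pool_size=(2, 2)),
        Flatten(),
        Dense(128, activation='relu'),
        Dense(10, activation='softmax'),                                  # 10 hypothetical classes
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    model.summary()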

RNN Modeling Capability

Recurrent neural networks (RNNs) are used for speech recognition, time series prediction, image captioning, and other tasks that require processing sequential information. As prebuilt RNN models are not as numerous as CNNs, it is important, if you have an RNN deep learning project, to consider what RNN models have previously been implemented and open sourced for a specific technology. For instance, Caffe has minimal RNN resources, while Microsoft's CNTK and Torch have ample RNN tutorials and prebuilt models. While vanilla TensorFlow has some RNN materials, TFLearn and Keras include many more RNN examples that utilize TensorFlow.
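As an illustration of the kind of prebuilt recurrent building blocks the paragraph refers to, here is a minimal sketch of a sequence classifier using Keras on top of TensorFlow; the vocabulary size, layer sizes, and output are illustrative assumptions, not taken from the article.

    from keras.models import Sequential
    from keras.layers import Embedding, LSTM, Dense

    # A small sequence classifier built from prebuilt recurrent layers (sizes are illustrative).
    model = Sequential([
        Embedding(input_dim=10000, output_dim=128),  # 10k-token vocabulary, 128-dim embeddings
        LSTM(64),                                    # single recurrent layer over the sequence
        Dense(1, activation='sigmoid'),              # e.g. one binary label per sequence
    ])
    model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
    model.summary()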

Architecture

In order to create and train new models in a particular framework, it is critical to have an easy-to-use and modular front end. TensorFlow, Torch, and MXNet have a straightforward, modular architecture that makes development straightforward. In comparison, frameworks such as Caffe require a significant amount of work to create a new layer. We've found that TensorFlow in particular is easy to debug and monitor during and after training, as the TensorBoard web GUI application is included.
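For the TensorBoard point, here is a minimal sketch of logging a scalar during training so it can be inspected in the web GUI. It assumes the TensorFlow 1.x summary API, and the "loss" tensor is a stand-in rather than a real model.

    import tensorflow as tf

    # Stand-in for a model's training loss (any scalar tensor works).
    loss = tf.reduce_mean(tf.square(tf.random_normal([100])))
    tf.summary.scalar('loss', loss)
    merged = tf.summary.merge_all()

    with tf.Session() as sess:
        writer = tf.summary.FileWriter('/tmp/tf_logs', sess.graph)
        for step in range(10):
            summary, _ = sess.run([merged, loss])
            writer.add_summary(summary, global_step=step)
        writer.close()
    # Then inspect the logged curves with: tensorboard --logdir=/tmp/tf_logs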

Speed

Torch and Nervana have the best documented performance on open source convolutional neural network benchmarking tests. TensorFlow performance was comparable for most tests, while Caffe and Theano lagged behind. Microsoft's CNTK claims to have some of the fastest RNN training times. Another study comparing Theano, Torch, and TensorFlow directly for RNNs showed that Theano performs the best of the three.

Multiple GPU Support

Most deep learning applications require an outstanding number of floating point operations (FLOPs). For example, Baidu's DeepSpeech recognition models take tens of ExaFLOPs to train. That is more than 10e18 calculations! As leading graphics processing units (GPUs) such as NVIDIA's Pascal TITAN X can execute roughly 11e12 FLOPs a second (about 11 TFLOPS), it would take over a week to train a new model on a sufficiently large dataset. In order to decrease the time it takes to build a model, multiple GPUs over multiple machines are needed. Luckily, most of the technologies outlined above offer this support. In particular, MXNet is reported to have one of the most optimized multi-GPU engines.
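A quick back-of-envelope check of that single-GPU estimate, using the figures quoted above (peak-throughput numbers that ignore real-world utilization):

    # Rough single-GPU training-time estimate from the figures quoted above.
    total_flops = 10e18            # ~10 ExaFLOPs for a large speech model
    gpu_flops_per_second = 11e12   # ~11 TFLOPS peak for a Pascal TITAN X
    seconds = total_flops / gpu_flops_per_second
    print(seconds / 86400)         # ~10.5 days at peak throughput on one GPU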

Keras Compatibility

Keras is a high-level library for doing fast deep learning prototyping. We've found that it is a great tool for getting data scientists comfortable with deep learning. Keras currently supports two back ends, TensorFlow and Theano, and will be gaining official support in TensorFlow in the future. Keras is also a good choice for a high-level library when considering that its author recently expressed that Keras will continue to exist as a front end that can be used with multiple back ends.
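As a small illustration of Keras acting as a front end over interchangeable back ends, the sketch below simply reports which back end a given installation is configured to use; the back end itself is selected in the ~/.keras/keras.json configuration file (or via the KERAS_BACKEND environment variable), not in model code.

    from keras import backend as K

    # Keras reads its back end from ~/.keras/keras.json (or the KERAS_BACKEND env var);
    # the same model-building code then runs on TensorFlow or Theano unchanged.
    print(K.backend())   # e.g. 'tensorflow' or 'theano'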

Summary

If you are interested in getting started with deep learning, I would recommend evaluating your own team's skills and your project needs first. For instance, for an image recognition application with a Python-centric team we would recommend TensorFlow, given its ample documentation, decent performance, and great prototyping tools. For scaling up an RNN to production with a Lua-competent client team, we would recommend Torch for its superior speed and RNN modeling capabilities.

In the future we will discuss some of our challenges in scaling up our models. These challenges include optimizing GPU usage over multiple machines and adapting open source libraries like CMU Sphinx and Kaldi for our deep learning pipeline.
Tags: Deep Learning, HackerNews