您的位置：首页 > Web前端

Caffe —— Deep learning in Practice

2016-02-20 15:37 453 查看

因工作交接须要。要将caffe用法及总体结构描写叙述清楚。

鉴于也有同学问过我相关内容，决定在本文中写个简单的tutorial, 方便大家參考。

本文简单的讲几个事情：

Caffe能做什么？

为什么选择caffe?

环境

总体结构

Protocol buffer

训练基本流程

Python中训练

Debug

Caffe能做什么？

定义网络结构

训练网络

C++/CUDA 写的结构

cmd/python/Matlab接口

CPU/GPU工作模式

给了一些參考模型&pretrain了的weights

为什么选择caffe?

模块化做的好

简单：改动结构无需该代码

开源：共同维护开源码

环境：

$ lsb_release -a

Distributor ID: Ubuntu

Description: Ubuntu 12.04.4 LTS

Release: 12.04

Codename: precise

$ cat /proc/version

Linux version 3.2.0-29-generic (buildd@allspice) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012

Vim + Taglist + Cscope

总体结构：

定义CAFFE为caffe跟文件夹，caffe的核心代码都在$CAFFE/src/caffe 下，主要有下面部分：net, blob, layer, solver.

net.cpp:

net定义网络。整个网络中含有非常多layers， net.cpp负责计算整个网络在训练中的forward, backward过程，即计算forward/backward 时各layer的gradient。

layers:

在$CAFFE/src/caffe/layers中的层。在protobuffer (.proto文件里定义message类型。.prototxt或.binaryproto文件里定义message的值) 中调用时包括属性name， type（data/conv/pool…）。 connection structure (input blobs and output blobs)，layer-specific parameters（如conv层的kernel大小）。定义一个layer须要定义其setup, forward 和backward过程。

blob.cpp:

net中的数据和求导结果通过4维的blob传递。一个layer有非常多blobs， e.g,

对data，weight blob大小为Number * Channels * Height * Width, 如256*3*224*224。

对conv层。weight blob大小为 Output 节点数 * Input 节点数 * Height * Width，如AlexNet第一个conv层的blob大小为96 x 3 x 11 x 11。

对inner product 层， weight blob大小为 1 * 1 * Output节点数 * Input节点数； bias blob大小为1 * 1 * 1 * Output节点数（ conv层和inner product层一样。也有weight和bias，所以在网络结构定义中我们会看到两个blobs_lr，第一个是weights的。第二个是bias的。相似地。weight_decay也有两个，一个是weight的，一个是bias的）；

blob中。mutable_cpu/gpu_data() 和cpu/gpu_data()用来管理memory。cpu/gpu_diff()和 mutable_cpu/gpu_diff()用来计算求导结果。

slover.cpp:

结合loss。用gradient更新weights。

主要函数：

Init(),

Solve(),

ComputeUpdateValue(),

Snapshot(), Restore(),//快照（拷贝）与恢复网络state

Test()。

在solver.cpp中有3中solver。即3个类：AdaGradSolver, SGDSolver和NesterovSolver可供选择。

关于loss。能够同一时候有多个loss。能够加regularization（L1/L2）；

Protocol buffer：

上面已经将过。 protocol buffer在 .proto文件里定义message类型，.prototxt或.binaryproto文件里定义message的值；

Caffe

Caffe的全部message定义在$CAFFE/src/caffe/proto/caffe.proto中。

Experiment

在实验中，主要用到两个protocol buffer: solver的和model的，分别定义solver參数（学习率啥的）和model结构(网络结构)。

技巧：

冻结一层不參与训练：设置其blobs_lr=0

对于图像。读取数据尽量别用HDF5Layer（由于仅仅能存float32和float64，不能用uint8, 所以太费空间）

训练基本流程：

数据处理

法一，转换成caffe接受的格式：lmdb, leveldb, hdf5 / .mat, list of images, etc.；法二。自己写数据读取层(如https://github.com/tnarihi/tnarihi-caffe-helper/blob/master/python/caffe_helper/layers/data_layers.py)

定义网络结构

配置Solver參数

训练：如 caffe train -solver solver.prototxt -gpu 0

在python中训练:

Document & Examples: https://github.com/BVLC/caffe/pull/1733

核心code：

$CAFFE/python/caffe/_caffe.cpp

定义Blob, Layer, Net, Solver类

$CAFFE/python/caffe/pycaffe.py

Net类的增强功能

Debug：

在Make.config中设置DEBUG := 1

在solver.prototxt中设置debug_info: true

在python/Matlab中察看forward & backward一轮后weights的变化

经典文献：

[ DeCAF ] J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell. Decaf: A deep convolutional activation feature for generic visual recognition. ICML, 2014.

[ R-CNN ] R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. CVPR, 2014.

[ Zeiler-Fergus Visualizing] M. Zeiler and R. Fergus. visualizing and understanding convolutional networks. ECCV, 2014.

[ LeNet ] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. IEEE, 1998.

[ AlexNet ] A. Krizhevsky, I. Sutskever, and G. Hinton. Imagenet classification with deep convolutional neural networks. NIPS, 2012.

[ OverFeat ] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun. Overfeat: Integrated recognition, localization and detection using convolutional networks. ICLR, 2014.

[ Image-Style (Transfer learning) ] S. Karayev, M. Trentacoste, H. Han, A. Agarwala, T. Darrell, A. Hertzmann, H. Winnemoeller. Recognizing Image Style. BMVC, 2014.

[ Karpathy14 ] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. Large-scale video classification with convolutional neural networks. CVPR, 2014.

[ Sutskever13 ] I. Sutskever. Training Recurrent Neural Networks. PhD thesis, University of Toronto, 2013.

[ Chopra05 ] S. Chopra, R. Hadsell, and Y. LeCun. Learning a similarity metric discriminatively, with application to face verification. CVPR, 2005.

内容来自用户分享和网络整理，不保证内容的准确性，如有侵权内容，可联系管理员处理

标签：

相关文章推荐

新的分享

章节导航