dlib 10: dlib's bundled demo, GoogLeNet (inception)

2017-09-20 17:04

01 Resources

Code: dlib\examples\dnn_inception_ex.cpp

Project name: dnn_inception_ex

MNIST data files used for training: http://yann.lecun.com/exdb/mnist/

http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz

http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz

http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz

http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz

After downloading, extract the four archives into dlib\data\mnist_data\ so that the folder contains:

dlib\data\mnist_data\t10k-images.idx3-ubyte

dlib\data\mnist_data\t10k-labels.idx1-ubyte

dlib\data\mnist_data\train-images.idx3-ubyte

dlib\data\mnist_data\train-labels.idx1-ubyte
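
Before starting a long training run it can help to confirm that all four files are really in the folder. The following is a minimal sketch, assuming a C++17 compiler with std::filesystem. The helper name missing_mnist_files and the MnistFile struct are hypothetical and not part of dlib. The file names in the download URLs use "-idx" while the listing above uses ".idx", so the sketch accepts either spelling.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper type: one MNIST file, split around the '-' or '.' separator.
struct MnistFile { const char* stem; const char* suffix; };

// Returns the names (hyphen spelling) of the MNIST files not found in `folder`.
// A file counts as present under either spelling, "-idx" or ".idx".
std::vector<std::string> missing_mnist_files(const fs::path& folder)
{
    static const MnistFile files[] = {
        {"train-images", "idx3-ubyte"},
        {"train-labels", "idx1-ubyte"},
        {"t10k-images",  "idx3-ubyte"},
        {"t10k-labels",  "idx1-ubyte"},
    };

    std::vector<std::string> missing;
    for (const MnistFile& f : files)
    {
        const std::string hyphen_name = std::string(f.stem) + "-" + f.suffix;
        const std::string dot_name    = std::string(f.stem) + "." + f.suffix;
        std::error_code ec;  // non-throwing overload: errors count as "not found"
        const bool found = fs::exists(folder / hyphen_name, ec) ||
                           fs::exists(folder / dot_name, ec);
        if (!found)
            missing.push_back(hyphen_name);
    }
    assert(missing.size() <= sizeof(files) / sizeof(files[0]));
    return missing;
}
```

Calling missing_mnist_files("dlib/data/mnist_data") from a small main and printing the returned names is enough to spot a file that was not extracted.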

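02 Project settings

In the examples solution, set the dnn_inception_ex project as the startup project.

Configuration Properties ==> Debugging ==> Command Arguments ==> ..\..\..\data\mnist_data
Configuration Properties ==> Debugging ==> Working Directory ==> $(OutDir)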
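
The relative argument only works because it is resolved against the working directory. The sketch below shows that resolution, assuming $(OutDir) expands to the Release folder seen in the command prompt later in this post; that expansion is an assumption, and resolve_argument is a hypothetical helper. Forward slashes are used so the snippet behaves the same on non-Windows systems.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Joins a relative command argument onto a working directory and collapses
// the ".." components purely lexically (the disk is never touched).
// The result is returned with forward slashes.
std::string resolve_argument(const std::string& working_dir, const std::string& argument)
{
    namespace fs = std::filesystem;
    assert(!working_dir.empty());
    const fs::path combined = fs::path(working_dir) / argument;
    return combined.lexically_normal().generic_string();
}
```

With the working directory D:/git/dlib/build/x64_19.6_examples/Release, the three ".." components climb out of Release, x64_19.6_examples and build, so the argument ends up pointing at D:/git/dlib/data/mnist_data.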

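03 Results

To get results in a reasonable time, build the Release configuration and run dnn_inception_ex.exe directly. The Debug build is very slow.

With the Release build running on the CPU only, training takes roughly 24 hours.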

D:\git\dlib\build\x64_19.6_examples\Release>dnn_inception_ex.exe ..\..\..\data\mnist_data
The net has 43 layers in it.
layer<0>        loss_multiclass_log
layer<1>        fc       (num_outputs=10) learning_rate_mult=1 weight_decay_mult=1 bias_learning_rate_mult=1 bias_weight_decay_mult=0
layer<2>        relu
layer<3>        fc       (num_outputs=32) learning_rate_mult=1 weight_decay_mult=1 bias_learning_rate_mult=1 bias_weight_decay_mult=0
layer<4>        max_pool (nr=2, nc=2, stride_y=2, stride_x=2, padding_y=0, padding_x=0)
layer<5>        concat   (1001,1002,1003)
layer<6>        tag1001
layer<7>        relu
layer<8>        con      (num_filters=4, nr=1, nc=1, stride_y=1, stride_x=1, padding_y=0, padding_x=0) learning_rate_mult=1 weight_decay_mult=1 bias_learning_rate_mult=1 bias_weight_decay_mult=0
layer<9>        skip1000
layer<10>       tag1002
layer<11>       relu
layer<12>       con      (num_filters=4, nr=3, nc=3, stride_y=1, stride_x=1, padding_y=1, padding_x=1) learning_rate_mult=1 weight_decay_mult=1 bias_learning_rate_mult=1 bias_weight_decay_mult=0
layer<13>       skip1000
layer<14>       tag1003
layer<15>       relu
layer<16>       con      (num_filters=4, nr=1, nc=1, stride_y=1, stride_x=1, padding_y=0, padding_x=0) learning_rate_mult=1 weight_decay_mult=1 bias_learning_rate_mult=1 bias_weight_decay_mult=0
layer<17>       max_pool (nr=3, nc=3, stride_y=1, stride_x=1, padding_y=1, padding_x=1)
layer<18>       tag1000
layer<19>       max_pool (nr=2, nc=2, stride_y=2, stride_x=2, padding_y=0, padding_x=0)
layer<20>       concat   (1001,1002,1003,1004)
layer<21>       tag1001
layer<22>       relu
layer<23>       con      (num_filters=10, nr=1, nc=1, stride_y=1, stride_x=1, padding_y=0, padding_x=0) learning_rate_mult=1 weight_decay_mult=1 bias_learning_rate_mult=1 bias_weight_decay_mult=0
layer<24>       skip1000
layer<25>       tag1002
layer<26>       relu
layer<27>       con      (num_filters=10, nr=3, nc=3, stride_y=1, stride_x=1, padding_y=1, padding_x=1) learning_rate_mult=1 weight_decay_mult=1 bias_learning_rate_mult=1 bias_weight_decay_mult=0
layer<28>       relu
layer<29>       con      (num_filters=16, nr=1, nc=1, stride_y=1, stride_x=1, padding_y=0, padding_x=0) learning_rate_mult=1 weight_decay_mult=1 bias_learning_rate_mult=1 bias_weight_decay_mult=0
layer<30>       skip1000
layer<31>       tag1003
layer<32>       relu
layer<33>       con      (num_filters=10, nr=5, nc=5, stride_y=1, stride_x=1, padding_y=2, padding_x=2) learning_rate_mult=1 weight_decay_mult=1 bias_learning_rate_mult=1 bias_weight_decay_mult=0
layer<34>       relu
layer<35>       con      (num_filters=16, nr=1, nc=1, stride_y=1, stride_x=1, padding_y=0, padding_x=0) learning_rate_mult=1 weight_decay_mult=1 bias_learning_rate_mult=1 bias_weight_decay_mult=0
layer<36>       skip1000
layer<37>       tag1004
layer<38>       relu
layer<39>       con      (num_filters=10, nr=1, nc=1, stride_y=1, stride_x=1, padding_y=0, padding_x=0) learning_rate_mult=1 weight_decay_mult=1 bias_learning_rate_mult=1 bias_weight_decay_mult=0
layer<40>       max_pool (nr=3, nc=3, stride_y=1, stride_x=1, padding_y=1, padding_x=1)
layer<41>       tag1000
layer<42>       input<matrix>
epoch: 0.0149333  learning rate: 0.01  average loss: 2.29948      steps without apparent progress: 0
Saved state to inception_sync
epoch: 0.0277333  learning rate: 0.01  average loss: 2.27178      steps without apparent progress: 0
Saved state to inception_sync_
...
epoch: 59.8512  learning rate: 1e-05  average loss: 0.00492863   steps without apparent progress: 1993
Saved state to inception_sync_
Epoch: 60    learning rate: 1e-06  average loss: 0.00496986   steps without apparent progress: 0
Saved state to inception_sync_
...
Traning NN...
training num_right: 59957
training num_wrong: 43
training accuracy:  0.999283
testing num_right: 9905
testing num_wrong: 95
testing accuracy:  0.9905

# The following 3 files are generated under D:\git\dlib\build\x64_19.6_examples\Release
mnist_network_inception.dat 105KB
inception_sync              224KB
inception_sync_             228KB
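
The numbers in the log can be checked by hand. The sketch below recomputes the two accuracy figures and, as an inference from the values rather than something the log states, reads the "epoch:" field as mini-batches processed times the mini-batch size of 128 divided by the 60000 training images. Under that reading the first two log lines correspond to 7 and 13 mini-batches. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Fraction of samples classified correctly.
double accuracy(std::size_t num_right, std::size_t num_wrong)
{
    assert(num_right + num_wrong > 0);
    return static_cast<double>(num_right) /
           static_cast<double>(num_right + num_wrong);
}

// Fraction of the training set seen after `steps` mini-batches.
// Assumed interpretation of the "epoch:" value printed by the trainer.
double epochs_completed(std::size_t steps, std::size_t mini_batch_size,
                        std::size_t num_samples)
{
    assert(num_samples > 0);
    return static_cast<double>(steps * mini_batch_size) /
           static_cast<double>(num_samples);
}
```

For example, accuracy(59957, 43) gives about 0.999283 and accuracy(9905, 95) gives 0.9905, matching the log; epochs_completed(7, 128, 60000) gives about 0.0149333.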

04 Code

dlib\examples\dnn_inception_ex.cpp

// The contents of this file are in the public domain. See LICENSE_FOR_EXAMPLE_PROGRAMS.txt
/*
This is an example illustrating the use of the deep learning tools from the
dlib C++ Library.  I'm assuming you have already read the introductory
dnn_introduction_ex.cpp and dnn_introduction2_ex.cpp examples.  In this
example we are going to show how to create inception networks.

An inception network is composed of inception blocks of the form:

               input from SUBNET
              /        |        \
             /         |         \
          block1    block2  ... blockN
             \         |         /
              \        |        /
          concatenate tensors from blocks
                       |
                    output

That is, an inception block runs a number of smaller networks (e.g. block1,
block2) and then concatenates their results.  For further reading refer to:
Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of
the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
*/

#include <dlib/dnn.h>
#include <iostream>
#include <dlib/data_io.h>

using namespace std;
using namespace dlib;

// Inception layer has some different convolutions inside.  Here we define
// blocks as convolutions with different kernel size that we will use in
// inception layer block.
template <typename SUBNET> using block_a1 = relu<con<10,1,1,1,1,SUBNET>>;
template <typename SUBNET> using block_a2 = relu<con<10,3,3,1,1,relu<con<16,1,1,1,1,SUBNET>>>>;
template <typename SUBNET> using block_a3 = relu<con<10,5,5,1,1,relu<con<16,1,1,1,1,SUBNET>>>>;
template <typename SUBNET> using block_a4 = relu<con<10,1,1,1,1,max_pool<3,3,1,1,SUBNET>>>;

// Here is inception layer definition. It uses different blocks to process input
// and returns combined output.  Dlib includes a number of these inceptionN
// layer types which are themselves created using concat layers.
template <typename SUBNET> using incept_a = inception4<block_a1,block_a2,block_a3,block_a4, SUBNET>;

// Network can have inception layers of different structure.  It will work
// properly so long as all the sub-blocks inside a particular inception block
// output tensors with the same number of rows and columns.
template <typename SUBNET> using block_b1 = relu<con<4,1,1,1,1,SUBNET>>;
template <typename SUBNET> using block_b2 = relu<con<4,3,3,1,1,SUBNET>>;
template <typename SUBNET> using block_b3 = relu<con<4,1,1,1,1,max_pool<3,3,1,1,SUBNET>>>;
template <typename SUBNET> using incept_b = inception3<block_b1,block_b2,block_b3,SUBNET>;

// Now we can define a simple network for classifying MNIST digits.  We will
// train and test this network in the code below.
using net_type = loss_multiclass_log<
        fc<10,
        relu<fc<32,
        max_pool<2,2,2,2,incept_b<
        max_pool<2,2,2,2,incept_a<
        input<matrix<unsigned char>>
        >>>>>>>>;

int main(int argc, char** argv) try
{
    // This example is going to run on the MNIST dataset.
    if (argc != 2)
    {
        cout << "This example needs the MNIST dataset to run!" << endl;
        cout << "You can get MNIST from http://yann.lecun.com/exdb/mnist/" << endl;
        cout << "Download the 4 files that comprise the dataset, decompress them, and" << endl;
        cout << "put them in a folder.  Then give that folder as input to this program." << endl;
        return 1;
    }

    std::vector<matrix<unsigned char>> training_images;
    std::vector<unsigned long>         training_labels;
    std::vector<matrix<unsigned char>> testing_images;
    std::vector<unsigned long>         testing_labels;
    load_mnist_dataset(argv[1], training_images, training_labels, testing_images, testing_labels);

    // Make an instance of our inception network.
    net_type net;
    cout << "The net has " << net.num_layers << " layers in it." << endl;
    cout << net << endl;

    cout << "Traning NN..." << endl;
    dnn_trainer<net_type> trainer(net);
    trainer.set_learning_rate(0.01);
    trainer.set_min_learning_rate(0.00001);
    trainer.set_mini_batch_size(128);
    trainer.be_verbose();
    trainer.set_synchronization_file("inception_sync", std::chrono::seconds(20));
    // Train the network.  This might take a few minutes...
    trainer.train(training_images, training_labels);

    // At this point our net object should have learned how to classify MNIST images.  But
    // before we try it out let's save it to disk.  Note that, since the trainer has been
    // running images through the network, net will have a bunch of state in it related to
    // the last batch of images it processed (e.g. outputs from each layer).  Since we
    // don't care about saving that kind of stuff to disk we can tell the network to forget
    // about that kind of transient data so that our file will be smaller.  We do this by
    // "cleaning" the network before saving it.
    net.clean();
    serialize("mnist_network_inception.dat") << net;
    // Now if we later wanted to recall the network from disk we can simply say:
    // deserialize("mnist_network_inception.dat") >> net;

    // Now let's run the training images through the network.  This statement runs all the
    // images through it and asks the loss layer to convert the network's raw output into
    // labels.  In our case, these labels are the numbers between 0 and 9.
    std::vector<unsigned long> predicted_labels = net(training_images);
    int num_right = 0;
    int num_wrong = 0;
    // And then let's see if it classified them correctly.
    for (size_t i = 0; i < training_images.size(); ++i)
    {
        if (predicted_labels[i] == training_labels[i])
            ++num_right;
        else
            ++num_wrong;
    }
    cout << "training num_right: " << num_right << endl;
    cout << "training num_wrong: " << num_wrong << endl;
    cout << "training accuracy:  " << num_right/(double)(num_right+num_wrong) << endl;

    // Let's also see if the network can correctly classify the testing images.
    // Since MNIST is an easy dataset, we should see 99% accuracy.
    predicted_labels = net(testing_images);
    num_right = 0;
    num_wrong = 0;
    for (size_t i = 0; i < testing_images.size(); ++i)
    {
        if (predicted_labels[i] == testing_labels[i])
            ++num_right;
        else
            ++num_wrong;
    }
    cout << "testing num_right: " << num_right << endl;
    cout << "testing num_wrong: " << num_wrong << endl;
    cout << "testing accuracy:  " << num_right/(double)(num_right+num_wrong) << endl;
}
catch(std::exception& e)
{
    cout << e.what() << endl;
}
}
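
The comments in the listing say that an inception block works as long as every sub-block outputs the same number of rows and columns, with the results concatenated. The sketch below traces tensor shapes through net_type without using dlib. It assumes the usual output-size formula (in + 2*padding - window) / stride + 1, which is consistent with the padding values in the layer dump above, and a 28x28 single-channel MNIST image as input. Shape, out_len, inception, max_pool_2x2 and trace_net are hypothetical helpers, not dlib API.

```cpp
#include <cassert>
#include <vector>

struct Shape { long channels; long rows; long cols; };

// Output length along one axis for a convolution or pooling window.
long out_len(long in, long window, long stride, long padding)
{
    assert(stride > 0 && in + 2 * padding >= window);
    return (in + 2 * padding - window) / stride + 1;
}

// An inception block whose branches all keep rows and cols unchanged:
// the output channel count is the sum of the branch channel counts.
Shape inception(const Shape& in, const std::vector<long>& branch_channels)
{
    Shape out{0, in.rows, in.cols};
    for (long c : branch_channels)
        out.channels += c;
    return out;
}

// max_pool<2,2,2,2,...>: 2x2 window, stride 2, padding 0.
Shape max_pool_2x2(const Shape& in)
{
    return Shape{in.channels, out_len(in.rows, 2, 2, 0), out_len(in.cols, 2, 2, 0)};
}

// Shape of the tensor that reaches fc<32> in net_type.
Shape trace_net()
{
    Shape s{1, 28, 28};                  // one MNIST image
    s = inception(s, {10, 10, 10, 10});  // incept_a: 40 x 28 x 28
    s = max_pool_2x2(s);                 //           40 x 14 x 14
    s = inception(s, {4, 4, 4});         // incept_b: 12 x 14 x 14
    s = max_pool_2x2(s);                 //           12 x  7 x  7
    return s;
}
```

Every branch in the layer dump uses stride 1 with padding equal to half the window size (0 for 1x1, 1 for 3x3, 2 for 5x5), so out_len returns the input length unchanged and the branches can be concatenated along the channel axis. Under these assumptions 12 x 7 x 7 = 588 values feed the first fully connected layer.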
