MXNet Deep Learning in Practice: Running Experiments on Your Own Data, and Notes on Some Problems
2016-09-19 15:06
Running your own data through MXNet
0. Building the code

```shell
git clone https://github.com/dmlc/mxnet.git
git clone https://github.com/dmlc/mshadow.git
git clone https://github.com/dmlc/dmlc-core.git
git clone https://github.com/dmlc/ps-lite.git
make -j4
```
1. Preparing the data

Follow http://blog.csdn.net/a350203223/article/details/50263737 to convert your data into the RecordIO (.rec) format.

Note: make_list.py can generate the train and val .lst files automatically; the `--train_ratio=XXX` option controls the split.
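For reference, each line of a .lst file follows the im2rec convention: an integer index, a float label, and the image path, tab-separated. A minimal sketch of writing one by hand (the image paths below are hypothetical placeholders):

```python
import os
import tempfile

# A minimal sketch of the .lst format that make_list.py emits:
# one line per image, "index<TAB>label<TAB>relative_path".
# The image paths here are hypothetical placeholders.
samples = [("images/agricultural/a00.jpg", 0),
           ("images/airplane/b00.jpg", 1)]

lst_path = os.path.join(tempfile.mkdtemp(), "train.lst")
with open(lst_path, "w") as f:
    for idx, (path, label) in enumerate(samples):
        f.write("%d\t%f\t%s\n" % (idx, label, path))

with open(lst_path) as f:
    print(f.read().splitlines()[0])
```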
2. Running the training

See train_cifar10.py and symbol_inception-bn-28-small.py under mxnet/example/image-classification.

The symbol file holds the network definition. Here is a simple 3-layer CNN:

symbol_UCM.py
```python
import find_mxnet
import mxnet as mx

def get_symbol(num_classes=21):
    data = mx.symbol.Variable('data')
    # first conv
    conv1 = mx.symbol.Convolution(data=data, kernel=(3,3), num_filter=128)
    bn1 = mx.symbol.BatchNorm(data=conv1)
    relu1 = mx.symbol.Activation(data=bn1, act_type="relu")
    pool1 = mx.symbol.Pooling(data=relu1, pool_type="max",
                              kernel=(5,5), stride=(3,3))
    # second conv
    conv2 = mx.symbol.Convolution(data=pool1, kernel=(3,3), num_filter=196)
    bn2 = mx.symbol.BatchNorm(data=conv2)
    relu2 = mx.symbol.Activation(data=bn2, act_type="relu")
    pool2 = mx.symbol.Pooling(data=relu2, pool_type="max",
                              kernel=(3,3), stride=(2,2))
    # third conv
    conv3 = mx.symbol.Convolution(data=pool2, kernel=(3,3), num_filter=196)
    bn3 = mx.symbol.BatchNorm(data=conv3)
    relu3 = mx.symbol.Activation(data=bn3, act_type="relu")
    pool3 = mx.symbol.Pooling(data=relu3, pool_type="max",
                              kernel=(2,2), stride=(2,2), name="final_pool")
    # first fully connected layer
    flatten = mx.symbol.Flatten(data=pool3)
    fc1 = mx.symbol.FullyConnected(data=flatten, num_hidden=420)
    relu4 = mx.symbol.Activation(data=fc1, act_type="relu")
    # second fully connected layer
    fc2 = mx.symbol.FullyConnected(data=relu4, num_hidden=num_classes)
    # loss
    softmax = mx.symbol.SoftmaxOutput(data=fc2, name='softmax')
    return softmax
```
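With no padding, each convolution shrinks the spatial size by kernel-1, and, assuming MXNet's default 'valid' pooling convention, each pooling layer maps size n to floor((n - kernel)/stride) + 1. A quick sanity check in plain Python (no mxnet needed) that the 109x109 input used by train_UCM.py survives all three stages:

```python
def conv_out(n, k):     # 'valid' convolution, stride 1, no padding
    return n - k + 1

def pool_out(n, k, s):  # pooling: floor((n - k) / s) + 1
    return (n - k) // s + 1

n = 109                               # input height/width from train_UCM.py
n = pool_out(conv_out(n, 3), 5, 3)    # conv1 + pool1 -> 35
n = pool_out(conv_out(n, 3), 3, 2)    # conv2 + pool2 -> 16
n = pool_out(conv_out(n, 3), 2, 2)    # conv3 + pool3 -> 7
print(n, n * n * 196)                 # final_pool is 7x7x196, flattened to 9604
```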
train_UCM.py
```python
import find_mxnet
import mxnet as mx
import argparse
import os, sys
import train_model

parser = argparse.ArgumentParser(description='train an image classifier on UCMnet')
parser.add_argument('--network', type=str, default='UCM_128_BN3layer',
                    help='the cnn to use')
parser.add_argument('--data-dir', type=str, default='/home/panda/Ureserch/data/Scene/UCM/',
                    help='the input data directory')
parser.add_argument('--gpus', type=str, default='0',
                    help='the gpus will be used, e.g "0,1,2,3"')
parser.add_argument('--num-examples', type=int, default=1680,
                    help='the number of training examples')
parser.add_argument('--batch-size', type=int, default=64,
                    help='the batch size')
parser.add_argument('--lr', type=float, default=.01,
                    help='the initial learning rate')
parser.add_argument('--lr-factor', type=float, default=.94,
                    help='multiply the lr by this factor every lr-factor-epoch epochs')
parser.add_argument('--lr-factor-epoch', type=float, default=5,
                    help='the number of epochs after which to decay the lr; can be fractional, e.g. .5')
parser.add_argument('--model-prefix', type=str,
                    help='the prefix of the model to load/save')
parser.add_argument('--num-epochs', type=int, default=80,
                    help='the number of training epochs')
parser.add_argument('--load-epoch', type=int,
                    help="load the model on an epoch using the model-prefix")
parser.add_argument('--kv-store', type=str, default='local',
                    help='the kvstore type')
# training log, used later to plot the training curves
parser.add_argument('--log-file', type=str, default="xxx",
                    help='the name of log file')
parser.add_argument('--log-dir', type=str, default="/xxx/xxx/xxx/",
                    help='directory of the log file')
args = parser.parse_args()

# network
import importlib
net = importlib.import_module('symbol_' + args.network).get_symbol(21)

# data: if the image-mean file is missing, it is computed automatically
# and stored at args.data_dir + "xxx.bin"
def get_iterator(args, kv):
    data_shape = (3, 109, 109)
    train = mx.io.ImageRecordIter(
        path_imgrec = args.data_dir + "xxx.rec",
        mean_img    = args.data_dir + "xxx.bin",
        data_shape  = data_shape,
        batch_size  = args.batch_size,
        rand_crop   = True,
        rand_mirror = True,
        num_parts   = kv.num_workers,
        part_index  = kv.rank)
    val = mx.io.ImageRecordIter(
        path_imgrec = args.data_dir + "xxx.rec",
        mean_img    = args.data_dir + "xxx.bin",
        rand_crop   = False,
        rand_mirror = False,
        data_shape  = data_shape,
        batch_size  = args.batch_size,
        num_parts   = kv.num_workers,
        part_index  = kv.rank)
    return (train, val)

# train
train_model.fit(args, net, get_iterator)
```
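The `--lr`, `--lr-factor`, and `--lr-factor-epoch` flags define a step decay: the learning rate is multiplied by lr-factor once every lr-factor-epoch epochs. A plain-Python sketch of the resulting schedule with the defaults above:

```python
def lr_at_epoch(epoch, base_lr=0.01, factor=0.94, step=5):
    # multiply the base rate by `factor` once per `step` epochs
    return base_lr * factor ** (epoch // step)

# the rate decays gradually over the 80 default epochs
print(lr_at_epoch(0))
print(lr_at_epoch(79))
```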
![](https://img-blog.csdn.net/20160318154158172?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center)
3. Plotting training and validation curves from the log

This needs matplotlib; install it beforehand.
```python
import matplotlib.pyplot as plt
import numpy as np
import re
import argparse

parser = argparse.ArgumentParser(description='Parses log file and generates train/val curves')
parser.add_argument('--log-file', type=str, default="/home/panda/Ureserch/mxnet_panda/UCM_EXP/UCM_128_log_4",
                    help='the path of log file')
args = parser.parse_args()

TR_RE = re.compile(r'.*?]\sTrain-accuracy=([\d\.]+)')
VA_RE = re.compile(r'.*?]\sValidation-accuracy=([\d\.]+)')

log = open(args.log_file).read()
log_tr = [float(x) for x in TR_RE.findall(log)]
log_va = [float(x) for x in VA_RE.findall(log)]
idx = np.arange(len(log_tr))

plt.figure(figsize=(8, 6))
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.plot(idx, log_tr, 'o', linestyle='-', color="r",
         label="Train accuracy")
plt.plot(idx, log_va, 'o', linestyle='-', color="b",
         label="Validation accuracy")
plt.legend(loc="best")
plt.xticks(np.arange(min(idx), max(idx)+1, 5))
plt.yticks(np.arange(0, 1, 0.2))
plt.ylim([0, 1])
plt.show()
```
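The two regexes in the plotting script match the per-epoch accuracy lines MXNet writes to the log. A quick standalone check against a sample log line (the exact line format is an assumption based on what the old mx.model logging emitted):

```python
import re

TR_RE = re.compile(r'.*?]\sTrain-accuracy=([\d\.]+)')

# A hypothetical log line in the format the plotting script expects.
line = "2016-09-19 15:06:00,123 Node[0] Epoch[3] Train-accuracy=0.815476"
m = TR_RE.findall(line)
print(m)  # ['0.815476']
```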
![](https://img-blog.csdn.net/20160318154820352?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center)
4. Saving the trained model

Add the following to train_model.py so the model is saved once training finishes:
```python
prefix = 'UCM_MODEL'
iteration = args.num_epochs
model.save(prefix, iteration)
```
5. Predicting with the saved model

predict_UCM.py
```python
import find_mxnet
import mxnet as mx
import logging
import argparse
import os, sys
import train_model
import numpy as np

# Here we use the Inception model trained on ImageNet that ships with
# mxnet; any other saved model works the same way.
prefix = '/home/panda/Ureserch/mxnet_panda/inception-21k model/Inception'
iteration = 9
model_load = mx.model.FeedForward.load(prefix, iteration)

data_shape = (3, 224, 224)
# data preparation, batch_size = 1
val = mx.io.ImageRecordIter(
    path_imgrec = '/xxx/xxx/' + "xxx.rec",
    mean_img    = '/xxx/xxx/' + "xxx.bin",
    rand_crop   = False,
    rand_mirror = False,
    data_shape  = data_shape,
    batch_size  = 1)

[prob, data1, label1] = model_load.predict(val, return_data=True)
```
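`predict` returns one row of softmax probabilities per image; top-1 predictions and accuracy then follow from a couple of numpy calls. A sketch on a toy prob/label pair standing in for the real outputs:

```python
import numpy as np

# Toy stand-ins for what model_load.predict returns: one row of class
# probabilities per image, plus the ground-truth labels.
prob = np.array([[0.1, 0.7, 0.2],
                 [0.6, 0.3, 0.1]])
label = np.array([1, 2])

pred = prob.argmax(axis=1)           # top-1 class per image
acc = float((pred == label).mean())  # fraction predicted correctly
print(pred, acc)  # [1 0] 0.5
```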
6. Extracting features from any layer with a pretrained model

feature_extraction.py

Model and data preparation are the same as in step 5.
```python
internals = model_load.symbol.get_internals()
# Remember the name of the layer you want features from; here it is "flatten".
fea_symbol = internals["flatten_output"]
feature_extractor = mx.model.FeedForward(ctx=mx.gpu(), symbol=fea_symbol,
                                         numpy_batch_size=1,
                                         arg_params=model_load.arg_params,
                                         aux_params=model_load.aux_params,
                                         allow_extra_params=True)
[val_feature, valdata, vallabel] = feature_extractor.predict(val, return_data=True)
```
Use scipy to save the features in MATLAB format; MATLAB is convenient for further analysis.

```python
import scipy.io as sio
sio.savemat('/xxx/xxx.mat', {'val_feature': val_feature})
```
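A round-trip sketch of the save step, assuming scipy is installed (the feature matrix here is random data standing in for the extracted features, and the output path is a temporary file):

```python
import os
import tempfile

import numpy as np
import scipy.io as sio

# Random stand-in for the extracted feature matrix (4 images x 9604 dims).
val_feature = np.random.rand(4, 9604).astype(np.float32)

path = os.path.join(tempfile.mkdtemp(), "val_feature.mat")
sio.savemat(path, {'val_feature': val_feature})

# Read it back to confirm the array survives the round trip.
loaded = sio.loadmat(path)['val_feature']
print(loaded.shape)  # (4, 9604)
```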
7. Initializing your network parameters from a pretrained model

To be continued.