
Understanding the source code of MATLAB's deep learning image classification example

2016-10-11 17:21

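I have recently been studying MATLAB's example of using deep learning for image classification and looked up some references along the way. Here I summarize which parts of the code I understood and which I did not.

What follows is my own rough understanding of the code, so please bear with me where I get things wrong.

MATLAB R2016a ships with an example named DeepLearningImageClassificationExample. The overall flow of the program is shown below: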

%% Image Category Classification Using Deep Learning
% This example shows how to use a pre-trained Convolutional Neural Network
% (CNN) as a feature extractor for training an image category classifier.
%
% Copyright 2016 The MathWorks, Inc.

function DeepLearningImageClassificationExample

%% Load the image data
% Download the Caltech101 image data set from:
% http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz
% The download can be slow. You can also fetch the archive ahead of time,
% extract it, and point outputFolder at that location (that is what I did).

url = 'http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz';
outputFolder = fullfile(tempdir, 'caltech101'); % folder that holds the downloaded data set

if ~exist(outputFolder, 'dir') % download only once
    disp('Downloading 126MB Caltech101 data set...');
    untar(url, outputFolder); % the download is a compressed archive, so extract it
end

% Caltech101 contains more than 100 image categories. To save time, this example uses only 3 of them:
% 'airplanes', 'ferry', and 'laptop'

rootFolder = fullfile(outputFolder, '101_ObjectCategories');
categories = {'airplanes', 'ferry', 'laptop'};

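% Store the images in MATLAB's built-in ImageDatastore.
% An ImageDatastore does not load the images up front; it reads each image only when it is requested,
% which keeps it efficient even for large image collections.
% imds holds both the images and their class labels; the labels are taken from the image folder names.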
imds = imageDatastore(fullfile(rootFolder, categories), 'LabelSource', 'foldernames');

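% Count the images in each category
tbl = countEachLabel(imds)

% The categories contain different numbers of images. To keep the training data balanced,
% trim every category to the size of the smallest one by picking that many images at random.

minSetCount = min(tbl{:,2}); % size of the smallest category

imds = splitEachLabel(imds, minSetCount, 'randomize'); % randomly keep minSetCount images per category

countEachLabel(imds) % count again (all 3 categories now have the same number of images)

% Display one image from each category as a sanity check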
airplanes = find(imds.Labels == 'airplanes', 1);
ferry = find(imds.Labels == 'ferry', 1);
laptop = find(imds.Labels == 'laptop', 1);

figure
subplot(1,3,1);
imshow(imds.Files{airplanes})
subplot(1,3,2);
imshow(imds.Files{ferry})
subplot(1,3,3);
imshow(imds.Files{laptop})

%% Load a pre-trained convolutional neural network (CNN)
% "AlexNet" is a CNN trained on ImageNet.
% As with the data set, you can download the model ahead of time.
cnnURL = 'http://www.vlfeat.org/matconvnet/models/beta16/imagenet-caffe-alex.mat';
cnnMatFile = fullfile(tempdir, 'imagenet-caffe-alex.mat');

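if ~exist(cnnMatFile, 'file') % download only once
    disp('Downloading pre-trained CNN model...');
    websave(cnnMatFile, cnnURL);
end

% The "AlexNet" model is stored in MatConvNet format.
% Import it as a SeriesNetwork object (convnet) from the Neural Network Toolbox.
% A SeriesNetwork lets you inspect the architecture, classify images, and extract features from a given layer.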

convnet = helperImportMatConvNet(cnnMatFile)

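% Inspect the layers of the network.
% It has 23 layers of the following kinds: input, conv, relu, norm, pool, fully_connected, classification.
% conv layer: extracts features; each filter (convolution kernel) detects one kind of feature, so many filters detect many kinds
% relu layer: rectification; values below 0 are set to 0, all others pass through unchanged
% norm layer: normalization across 5 neighboring channels, acting much like "lateral inhibition"
% pool layer: max pooling; replaces each region with its maximum, which emphasizes strong features and reduces dimensionality
% fc layer: fully connected layer
% classification: classification layer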
convnet.Layers
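
% A toy illustration of the relu and max-pooling descriptions above. This is
% a sketch with made-up numbers, not part of the MathWorks example. AlexNet
% itself pools over overlapping 3-by-3 windows; non-overlapping 2-by-2
% windows are assumed here only to keep the numbers easy to follow.
toyMap = [-1 2 0 3; 4 -5 6 1; 0 1 -2 2; 3 0 1 -1];
toyRelu = max(toyMap, 0); % negative values become 0, the rest is unchanged
toyPool = zeros(2, 2);
for r = 1:2
    for c = 1:2
        block = toyRelu(2*r-1:2*r, 2*c-1:2*c);
        toyPool(r, c) = max(block(:)); % keep only the largest value of each block
    end
end
toyPool % the 4-by-4 map shrinks to 2-by-2: [4 6; 3 2]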

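% Inspect the first layer.
% It defines the input dimensions; different networks require different input sizes.
% This network requires 227-by-227-by-3 input images.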
convnet.Layers(1)

% Inspect the last layer.
% The network has 1000 outputs, one per class, so it can distinguish up to 1000 classes.
convnet.Layers(end)
numel(convnet.Layers(end).ClassNames)

%%
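% This CNN is not used here for its original 1000-class task.
% Instead, it is repurposed for the categories we care about by training a new classifier on data from those categories.

%% Preprocess the images
% The network requires 227-by-227-by-3 input.
% To avoid re-saving every image at that size, attach a read function to the ImageDatastore that preprocesses images on the fly.
% The read function is called every time an image is read.

% Set the ImageDatastore read function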
imds.ReadFcn = @(filename)readAndPreprocessImage(filename);

%%
function Iout = readAndPreprocessImage(filename)

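    I = imread(filename);

    % For grayscale images, replicate the image 3 times to form an RGB image
    if ismatrix(I)
        I = cat(3,I,I,I);
    end

    Iout = imresize(I, [227 227]);

    % Note: this resize does not preserve the aspect ratio.
    % That is acceptable here because objects in Caltech101 are centered and fill most of the image.
    % For other data sets, the aspect ratio needs to be taken into account.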
end
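
% A hedged sketch, not part of the MathWorks example: for data sets where
% distortion matters, one option is to scale the longer side to the target
% size and pad the rest. The name resizeKeepAspect and the choice of
% centered zero (black) padding are assumptions made for illustration.
function padded = resizeKeepAspect(rgbImage, targetSize)
    % rgbImage: H-by-W-by-3 array; targetSize: scalar side length, e.g. 227
    [h, w, ~] = size(rgbImage);
    scale = targetSize / max(h, w);
    newH = max(1, round(h * scale));
    newW = max(1, round(w * scale));
    resized = imresize(rgbImage, [newH newW]);

    % Start from an all-zero canvas of the same class as the input
    padded = zeros(targetSize, targetSize, size(rgbImage, 3), 'like', rgbImage);
    top = floor((targetSize - newH) / 2);
    left = floor((targetSize - newW) / 2);
    padded(top + (1:newH), left + (1:newW), :) = resized; % paste in the center
end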

%% Prepare the training and test sets
% 30% of the images are used for training and 70% for testing

[trainingSet, testSet] = splitEachLabel(imds, 0.3, 'randomize');

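%% Extract training features
% The early layers capture basic image features such as edges and blobs.
% Later layers combine these basic features into higher-level ones.
% The higher-level features form a richer image representation that is better suited to recognition.

% Visualize the weights of the first convolutional layer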
w1 = convnet.Layers(2).Weights;

w1 = mat2gray(w1);
w1 = imresize(w1,5);

figure
montage(w1)
title('First convolutional layer weights')

%%
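% Extract the features. Which layer should be used?
% The usual choice is the layer just before the final classification stage; in this network that is 'fc7'.
% Note: activations runs on the GPU, so 'MiniBatchSize' must be small enough to fit in GPU memory.
% A larger 'MiniBatchSize' runs faster but uses more GPU memory.
% 'OutputAs' is set to 'columns' so that each feature vector is stored as a column, which speeds up training of the multiclass linear SVM.

featureLayer = 'fc7';
trainingFeatures = activations(convnet, trainingSet, featureLayer, ...
    'MiniBatchSize', 32, 'OutputAs', 'columns'); % each column holds the features of one image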

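%% Train a multiclass SVM classifier on the extracted features
% 'Learners' set to 'Linear': training uses a stochastic gradient descent solver
% 'Coding' set to 'onevsall': K classes produce K binary classifiers
% 'ObservationsIn' set to 'columns': matches how the features are stored

% Get the training labels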
trainingLabels = trainingSet.Labels;

classifier = fitcecoc(trainingFeatures, trainingLabels, ...
'Learners', 'Linear', 'Coding', 'onevsall', 'ObservationsIn', 'columns');

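%% Evaluate the classifier
% Repeat the steps above on the test set: extract features, then predict labels

% Extract the test features
testFeatures = activations(convnet, testSet, featureLayer, 'MiniBatchSize',32);

% Pass the extracted features to the trained classifier
predictedLabels = predict(classifier, testFeatures);

% Get the known labels of the test samples
testLabels = testSet.Labels;

% Evaluate the predictions with a confusion matrix.
% The result is a 3-by-3 matrix: diagonal entries count correct predictions, all other entries count errors.
confMat = confusionmat(testLabels, predictedLabels);

% Divide each row by its sum (element-wise right division) to get the per-class accuracy
confMat = bsxfun(@rdivide,confMat,sum(confMat,2))

% Compute the mean accuracy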
mean(diag(confMat))
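
% A small worked example of the two steps above, using made-up counts (these
% are NOT results from this network). Each row is a true class with 10 test
% images; the diagonal holds the correctly classified ones.
toyCounts = [8 1 1; 0 9 1; 2 0 8];
toyRates = bsxfun(@rdivide, toyCounts, sum(toyCounts, 2)); % every row now sums to 1
toyMeanAccuracy = mean(diag(toyRates)) % (0.8 + 0.9 + 0.8) / 3, about 0.8333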

%% Try the newly trained classifier on a test image

newImage = fullfile(rootFolder, 'airplanes', 'image_0690.jpg');

% Preprocess the image to 227-by-227-by-3
img = readAndPreprocessImage(newImage);

% Extract the features
imageFeatures = activations(convnet, img, featureLayer);

% Predict with the classifier
label = predict(classifier, imageFeatures)

displayEndOfDemoMessage(mfilename)
end

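There are still a few things I do not fully understand, for example:

1. How exactly is the norm layer implemented, and what is it for?

2. When extracting features, why choose "fc7" rather than "fc8" or some other layer?

I would be grateful if anyone who knows the answers could share them!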
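
On the first question, here is a minimal sketch of what the norm layer appears to compute, assuming the cross-channel formula from the AlexNet paper (Krizhevsky et al., 2012): each activation is divided by (k + alpha * S)^beta, where S is the sum of squared activations over n neighboring channels at the same pixel. The function name localResponseNorm is made up for illustration, and MATLAB's built-in layer may scale alpha differently, so treat this as an approximation of the idea and not as the toolbox's actual code.

```matlab
function out = localResponseNorm(act, n, k, alpha, beta)
% act: H-by-W-by-C array of activations (C channels)
% n: number of neighboring channels in the window (5 in the AlexNet paper)
% k, alpha, beta: constants (2, 1e-4 and 0.75 in the AlexNet paper)
    numChannels = size(act, 3);
    out = zeros(size(act), 'like', act);
    half = floor(n / 2);
    for ch = 1:numChannels
        % The window is clipped at the first and last channel
        lo = max(1, ch - half);
        hi = min(numChannels, ch + half);
        sumSq = sum(act(:, :, lo:hi) .^ 2, 3); % squared activity of the neighbors
        out(:, :, ch) = act(:, :, ch) ./ (k + alpha * sumSq) .^ beta;
    end
end
```

Saved as localResponseNorm.m, it can be tried with out = localResponseNorm(ones(1, 1, 5), 5, 2, 1e-4, 0.75). The larger the activity in the neighboring channels, the more a response is damped, which is where the "lateral inhibition" comparison comes from.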
