
Deep Learning (1): Softmax Regression Exercise

2014-12-04 09:30
Introduction:

This exercise follows http://www.cnblogs.com/tornadomeet/archive/2013/03/23/2977621.html and http://deeplearning.stanford.edu/wiki/index.php/Exercise:Softmax_Regression

The data is hyperspectral: the training set is 103*42776 and the test set is 103*21391. The experiments were run in MATLAB 2009a.

Theory:

Only the softmax model is used, with no hidden layer: the network has just an input and an output layer. The input is the raw hyperspectral image; all of the data is used for training and half of it for prediction. The main work in the exercise is computing the cost function and its gradient.

The derivation is as follows. For an input $x$, softmax regression models the class probabilities as

$$P(y = j \mid x; \theta) = \frac{e^{\theta_j^T x}}{\sum_{l=1}^{k} e^{\theta_l^T x}},$$

and the (unregularized) cost over $m$ training samples is

$$J(\theta) = -\frac{1}{m}\left[\sum_{i=1}^{m}\sum_{j=1}^{k} 1\{y^{(i)} = j\}\,\log\frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}}\right].$$

In softmax regression the solution to the parameter optimization is not unique: whenever an optimal parameter set is found, subtracting the same vector $\psi$ from every $\theta_j$ gives exactly the same cost, so the parameters are not a unique solution. The proof is as follows:

$$P(y^{(i)} = j \mid x^{(i)}; \theta - \psi) = \frac{e^{(\theta_j - \psi)^T x^{(i)}}}{\sum_{l=1}^{k} e^{(\theta_l - \psi)^T x^{(i)}}} = \frac{e^{\theta_j^T x^{(i)}}\, e^{-\psi^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}\, e^{-\psi^T x^{(i)}}} = \frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}}.$$


  Why does this happen? Intuitively, the cost function is not strictly convex: around a minimum it is "flat", so all parameter values in that neighborhood give the same cost. How can this be avoided? Adding a regularization (weight decay) term solves it. For example, when solving with Newton's method, the Hessian may be singular if no regularization term is added, which leads to exactly the situation above; once the term is added the Hessian is no longer singular. The cost function with the weight decay term is:

$$J(\theta) = -\frac{1}{m}\left[\sum_{i=1}^{m}\sum_{j=1}^{k} 1\{y^{(i)} = j\}\,\log\frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}}\right] + \frac{\lambda}{2}\sum_{i=1}^{k}\sum_{j=1}^{n} \theta_{ij}^2.$$


  The corresponding partial derivative (gradient) is then:

$$\nabla_{\theta_j} J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[x^{(i)}\left(1\{y^{(i)} = j\} - P(y^{(i)} = j \mid x^{(i)}; \theta)\right)\right] + \lambda\,\theta_j.$$


Notes:
In the MATLAB implementation, the line groundTruth=full(sparse(labels,1:numCase,1)) in the softmaxCost function may be hard to understand at first. For example, let data=[1 2 3 4;5 6 7 8], a 2*4 matrix, and labels=[3 2 4 1]. Then sparse(labels,1:numCase,1) contains the entries (3,1) 1; (2,2) 1; (4,3) 1; (1,4) 1. The entry (1,4), for instance, means that the 4th sample has label 1, i.e. 1{y(4)=1}=1; if y(4) were any other value the entry would be 0. Expanding this to the full matrix gives the array below (verified by the snippet after it):

0 0 0 1

0 1 0 0

1 0 0 0

0 0 1 0
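
A quick way to check this in MATLAB, reproducing the toy labels from the example above (variable names are just for illustration):

labels  = [3 2 4 1];          % toy labels from the example above
numCase = numel(labels);      % 4 samples
% row index = label, column index = sample, value = 1
groundTruth = full(sparse(labels, 1:numCase, 1))
% groundTruth =
%      0     0     0     1
%      0     1     0     0
%      1     0     0     0
%      0     0     1     0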

In softmaxPredict: theta=softmaxModel.optTheta;

pred=zeros(1,size(data,2));

[nop,pred]=max(theta*data); nop holds the largest value in each column of theta*data, and pred holds the row index of that maximum, i.e. the predicted class of each sample. Accuracy is then computed with acc=mean(labels(:)==pred(:)). A toy example is shown below.
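
As a small illustration (with made-up scores, not values from the experiment), max along each column gives the predicted class and mean gives the accuracy:

scores = [0.1 0.7; 0.9 0.2; 0.3 0.1];   % 3 classes x 2 samples (hypothetical values)
[nop, pred] = max(scores);               % nop = [0.9 0.7], pred = [2 1]
labels = [2; 1];                         % true labels of the 2 samples
acc = mean(labels(:) == pred(:))         % acc = 1, i.e. 100% on this toy case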

The result is 74.106% accuracy. This is quite low, which suggests that softmax regression on its own is not well suited to classifying this data directly; compared with an SVM the accuracy is much lower.

Areas for further improvement: I am still not very familiar with matrix operations in MATLAB and need more practice.

Appendix: code

softmaxExercise

clc;
clear all;

%%======================================================================
%% STEP 0: Initialise constants and parameters
%
%  Here we define and initialise some constants which allow your code
%  to be used more generally on any arbitrary input.
%  We also initialise some parameters used for tuning the model.

inputSize=103;
numClasses=9;
lambda=1e-4;

%%======================================================================
%% STEP 1: Load data
%
%  In this section, we load the input and output data.
%  For softmax regression on the hyperspectral data,
%  the input data is the spectra (one column per sample), and
%  the output data is the class labels.
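% NOTE (assumption): one.mat is expected to provide train_data (103 x 42776),
% test_data (103 x 21391), train_label and train_label's test counterpart test_label.
% Train and test data are concatenated below so that all samples are used for
% training (see the introduction), and the values are rescaled to [0,1].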
load one.mat
train_data=[train_data test_data];
train_data=(train_data-min(train_data(:)))./(max(train_data(:))-min(train_data(:)));
images = train_data;
labels = [train_label;test_label];
%labels(labels==0)=10;

inputData=images;

% DEBUG = true; % Set DEBUG to true when debugging.
DEBUG = false;
if DEBUG
    inputSize = 8;
    inputData = randn(8, 100);
    labels = randi(numClasses, 100, 1); % random labels in 1..numClasses
end

theta=0.005*randn(numClasses*inputSize,1);

%%======================================================================
%% STEP 2: Implement softmaxCost
%
%  Implement softmaxCost in softmaxCost.m.

[cost,grad]=softmaxCost(theta,numClasses,inputSize,lambda,inputData,labels);

%%======================================================================
%% STEP 3: Gradient checking
%
%  As with any learning algorithm, you should always check that your
%  gradients are correct before learning the parameters.
%

if DEBUG
    numGrad = computeNumericalGradient( @(x) softmaxCost(x, numClasses, ...
                                        inputSize, lambda, inputData, labels), theta);

    % Use this to visually compare the gradients side by side
    disp([numGrad grad]);

    % Compare numerically computed gradients with those computed analytically
    diff = norm(numGrad-grad)/norm(numGrad+grad);
    disp(diff);
    % The difference should be small.
    % In our implementation, these values are usually less than 1e-7.

    % When your gradients are correct, congratulations!
end

%% STEP 4: Learning parameters
%
%  Once you have verified that your gradients are correct,
%  you can start training your softmax regression code using softmaxTrain
%  (which uses minFunc).

options.maxIter=100;
%softmaxModel is just a struct holding the learned optimal parameters together with the input size and the number of classes
softmaxModel=softmaxTrain(inputSize,numClasses,lambda,inputData,labels,options);

%%======================================================================
%% STEP 5: Testing
%
%  You should now test your model against the test images.
%  To do this, you will first need to write softmaxPredict
%  (in softmaxPredict.m), which should return predictions
%  given a softmax model and the input data.
test_data=(test_data-min(test_data(:)))./(max(test_data(:))-min(test_data(:)));
images = test_data;
labels = test_label;
%labels(labels==0) = 10; % Remap 0 to 10

inputData=images;
size(softmaxModel.optTheta);
size(inputData);

[pred]=softmaxPredict(softmaxModel,inputData);
acc=mean(labels(:)==pred(:));

fprintf('Accuracy: %0.3f%%\n', acc*100);


softmaxCost

function [cost,grad]=softmaxCost(theta,numClasses,inputSize,lambda,data,labels)

theta=reshape(theta,numClasses,inputSize);

numCase=size(data,2);
groundTruth=full(sparse(labels,1:numCase,1)); % indicator matrix: groundTruth(j,i)=1{labels(i)=j} (the tricky part discussed above)

cost = 0;
thetagrad = zeros(numClasses, inputSize);

% subtract the per-column maximum before exponentiating, for numerical stability
M=bsxfun(@minus,theta*data,max(theta*data,[],1));
M=exp(M);
% normalize each column so the entries are class probabilities summing to 1
p=bsxfun(@rdivide,M,sum(M));

% cross-entropy cost plus the weight decay (regularization) term
cost=-1/numCase*groundTruth(:)'*log(p(:))+lambda/2*sum(theta(:).^2);

% gradient with respect to theta (see the formula in the theory section)
thetagrad=-1/numCase*(groundTruth-p)*data'+lambda*theta;
grad =thetagrad(:);

end


softmaxTrain

function softmaxModel=softmaxTrain(inputSize,numClasses,lambda,inputData,labels,options)

if ~exist('options', 'var')
    options = struct;
end

if ~isfield(options, 'maxIter')
    options.maxIter = 400;
end

theta = 0.005 * randn(numClasses * inputSize, 1);

addpath minFunc/
options.Method='lbfgs';

options.display='on';

[softmaxOptTheta, cost] = minFunc( @(p) softmaxCost(p, ...
numClasses, inputSize, lambda, ...
inputData, labels), ...
theta, options);

% Fold softmaxOptTheta into a nicer format
softmaxModel.optTheta = reshape(softmaxOptTheta, numClasses, inputSize);
softmaxModel.inputSize = inputSize;
softmaxModel.numClasses = numClasses;

end


softmaxPredict

function [pred]=softmaxPredict(softmaxModel,data)

theta=softmaxModel.optTheta;
pred=zeros(1,size(data,2));

[nop,pred]=max(theta*data);
%nop holds the maximum of each column of theta*data; pred holds the row index (the predicted class) of that maximum;

end
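
The gradient-checking step calls computeNumericalGradient, which is not listed in the appendix. A minimal sketch, assuming the standard two-sided finite-difference approximation from the UFLDL exercise (not necessarily the exact file used in this experiment):

function numgrad = computeNumericalGradient(J, theta)
% Numerically approximate the gradient of J at theta with central differences.
% J is a handle returning the cost as its first output (e.g. softmaxCost).
epsilon = 1e-4;
numgrad = zeros(size(theta));
for i = 1:numel(theta)
    e = zeros(size(theta));
    e(i) = epsilon;
    numgrad(i) = (J(theta + e) - J(theta - e)) / (2 * epsilon);
end

end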