[Deep Learning Study Notes] Annotating yusugomori's SDA Code -- Preparation

2013-08-04 20:50

1. Basic principles of SDA

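This post continues the earlier ones in this series annotating yusugomori's deep learning code. Having now reached SDA, I will briefly cover its basic principles along the way.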

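SDA stands for stacked denoising autoencoders. Compared with a standard DA (denoising autoencoder), there are two main differences: 1. there are more hidden layers, which is what "stacked" means; 2. the top layer is LR (logistic regression), which uses softmax to compute the probability of each class. Each pair of adjacent hidden layers is still trained with the standard DA training algorithm, and the last layer is trained with the LR algorithm. In fact, the hidden layers can also be replaced by RBMs. The basic principle of an RBM differs from that of a DA, but it serves the same purpose: both produce a representation of the input layer.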
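
To make the softmax step of the top LR layer concrete, here is a minimal sketch. The function name softmax and its vector-based signature are my own choices for illustration and are not taken from yusugomori's code; the max-subtraction is a common numerical-stability trick, assumed here rather than confirmed from the source.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Convert raw class scores into probabilities that sum to 1.
// Hypothetical helper for illustration; not yusugomori's implementation.
std::vector<double> softmax(const std::vector<double>& scores) {
    std::vector<double> probs(scores.size());
    if (scores.empty()) return probs;

    // Subtract the largest score before exponentiating to avoid overflow.
    const double max_score = *std::max_element(scores.begin(), scores.end());

    double sum = 0.0;
    for (std::size_t k = 0; k < scores.size(); ++k) {
        probs[k] = std::exp(scores[k] - max_score);
        sum += probs[k];
    }
    // Normalize so the entries form a probability distribution.
    for (double& p : probs) p /= sum;
    return probs;
}
```

The predicted class is then simply the index with the largest probability.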

2. hidden layer

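In the code, both the hidden layer and the DA are components of SDA, and the two share the same network structure. So why declare two things (one hidden layer, one DA)? Because SDA needs a sampling step (after computing the probability, run a Bernoulli trial with that probability to get a 0-1 output), which the DA code does not have (the RBM code does). In fact, if a separate sample function were written, there would be no need for a HiddenLayer class at all. However, yusugomori's source code also contains a DNN, so perhaps writing it this way makes the DNN easier to implement.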
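
As a rough illustration of the "separate sample function" idea, the sketch below computes each hidden node's activation probability and draws one Bernoulli sample from it. The name sample_hidden, the use of std::vector, and the std::mt19937 generator are all assumptions of mine; the original code uses raw arrays and its own random helpers.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical free function: for each hidden node, compute the sigmoid
// activation probability and draw a single Bernoulli sample (0 or 1).
// W has one row per hidden node; each row holds one weight per input node.
std::vector<int> sample_hidden(const std::vector<int>& input,
                               const std::vector<std::vector<double>>& W,
                               const std::vector<double>& b,
                               std::mt19937& rng) {
    std::vector<int> sample(W.size());
    for (std::size_t i = 0; i < W.size(); ++i) {
        // Weighted sum of the inputs plus the bias of node i.
        double pre_activation = b[i];
        for (std::size_t j = 0; j < input.size(); ++j) {
            pre_activation += W[i][j] * input[j];
        }
        // Probability that node i is on.
        const double p = 1.0 / (1.0 + std::exp(-pre_activation));
        std::bernoulli_distribution draw(p);
        sample[i] = draw(rng) ? 1 : 0;
    }
    return sample;
}
```

In a stacked network, the 0-1 vector returned for one layer would be passed as the input of the next layer.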

3. HiddenLayer.h

The header file is as follows:

class HiddenLayer
{
public:
    int N;          // the number of training samples
    int n_in;       // the number of nodes in the input layer
    int n_out;      // the number of nodes in the output layer
    double **W;     // the network weights, n_out rows by n_in columns
    double *b;      // the biases, one per output node

    // allocate memory and initialize the parameters
    HiddenLayer (
        int,        // N
        int,        // n_in
        int,        // n_out
        double**,   // W
        double*     // b
    );
    ~HiddenLayer();
    // calculate the value of a certain node in the hidden layer
    double output (
        int*,       // input value vector
        double*,    // the network weights of the node in the hidden layer
        double      // the bias of the node in the hidden layer
    );
    // sample the 0-1 state of the hidden layer given the input
    void sample_h_given_v (
        int*,       // input value vector
        int*        // the output 0-1 state of the hidden layer
    );
};

4. HiddenLayer is implemented in Sda.cpp; the relevant code is as follows:

// HiddenLayer
// (memset requires <cstring>; uniform() is a helper defined elsewhere)
HiddenLayer::HiddenLayer (
    int size,       // N
    int in,         // n_in
    int out,        // n_out
    double **w,     // W
    double *bp      // b
)
{
    N = size;
    n_in = in;
    n_out = out;

    if(w == NULL)
    {
        // allocate memory for W: n_out rows of n_in weights each
        W = new double*[n_out];
        for(int i=0; i<n_out; i++)
            W[i] = new double[n_in];
        // initialize each weight uniformly in (-1/n_in, 1/n_in)
        double a = 1.0 / n_in;
        for(int i=0; i<n_out; i++)
        {
            for(int j=0; j<n_in; j++)
            {
                W[i][j] = uniform(-a, a);
            }
        }
    }
    else
    {
        W = w;
    }

    if(bp == NULL)
    {
        b = new double[n_out];
        // I add this to initialize b; it must zero all n_out doubles
        memset (b, 0, sizeof(double) * n_out);
    }
    else
    {
        b = bp;
    }
}

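HiddenLayer::~HiddenLayer()
{
    // clear W and b
    // note: this also frees W and b when they were passed in by the caller,
    // so the caller must not free them again
    for(int i=0; i<n_out; i++)
        delete[] W[i];      // each row was allocated with new[]
    delete[] W;
    delete[] b;
}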

double HiddenLayer::output (
    int *input,
    double *w,
    double b
)
{
    // iterate over all the input nodes and calculate the output of the hidden node
    double linear_output = 0.0;
    for(int j=0; j<n_in; j++)
    {
        linear_output += w[j] * input[j];
    }
    linear_output += b;
    return sigmoid(linear_output);
}

void HiddenLayer::sample_h_given_v (
    int *input,
    int *sample
)
{
    for(int i=0; i<n_out; i++)
    {
        // draw a single Bernoulli trial with the node's output as the probability
        sample[i] = binomial(1, output(input, W[i], b[i]));
    }
}
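
The snippet above calls uniform, binomial, and sigmoid without showing them. Their names and call signatures come from the code above, but the bodies below are only my guess at what such helpers might look like, written with std::rand for simplicity; they should not be read as the actual definitions in Sda.cpp.

```cpp
#include <cmath>
#include <cstdlib>

// Draw a double uniformly from [min, max).
// Assumed body; only the name and call signature come from the snippet.
double uniform(double min, double max) {
    return std::rand() / (RAND_MAX + 1.0) * (max - min) + min;
}

// Count successes in n independent trials with success probability p.
// With n == 1 this is a single Bernoulli trial returning 0 or 1.
int binomial(int n, double p) {
    if (p < 0.0 || p > 1.0) return 0;   // reject invalid probabilities
    int successes = 0;
    for (int i = 0; i < n; ++i) {
        const double r = std::rand() / (RAND_MAX + 1.0);  // r in [0, 1)
        if (r < p) ++successes;
    }
    return successes;
}

// Logistic sigmoid: maps any real number into the interval (0, 1).
double sigmoid(double x) {
    return 1.0 / (1.0 + std::exp(-x));
}
```

Calling std::srand once at program start makes runs reproducible for a fixed seed.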

5. SDA also uses LR, DA, and other components.

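For annotations of their header files, see my earlier posts. Their implementations have all been placed in Sda.cpp; compared with the earlier dA.cpp and LR.cpp there is no difference, just a simple copy & paste. I will not annotate that code in detail here either; if needed, please refer to my earlier posts in the "[Deep Learning Study Notes] Annotating yusugomori's xxx code" series.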
