

2017-07-03 18:43
1. Why Introduce Neural Networks

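In short, the number of features explodes when n is large. With n = 100, the second-order terms alone ($x_1^2, x_1x_2, x_1x_3, \dots, x_1x_{100};\ x_2^2, x_2x_3, \dots, x_2x_{100};\ \dots$) number about 5,000 (the sum 100 + 99 + ... + 1 = 5,050). In practical problems n often runs into the millions or hundreds of millions, so such a model overfits easily and is expensive to compute. This is the motivation for introducing neural networks.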
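The count above can be checked directly. The following is a minimal sketch; the function name `count_quadratic_terms` is made up for illustration and does not come from the course material.

```python
from itertools import combinations_with_replacement

def count_quadratic_terms(n: int) -> int:
    """Count the distinct degree-2 monomials x_i * x_j with 1 <= i <= j <= n."""
    return sum(1 for _ in combinations_with_replacement(range(1, n + 1), 2))

n = 100
print(count_quadratic_terms(n))  # 5050
print(n * (n + 1) // 2)          # closed form n(n+1)/2, also 5050
```

The count grows quadratically in n, so for a million features the second-order terms alone would number on the order of 10^11.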

2. Neural Network Model

Let's examine how we will represent a hypothesis function using neural networks. At a very simple level, neurons are computational units that take inputs (dendrites) as electrical signals (called "spikes") and channel them to outputs (axons). In our model, the dendrites are like the input features $x_1, \dots, x_n$, and the output is the result of our hypothesis function. The input node $x_0$ is sometimes called the "bias unit"; it is always equal to 1. In neural networks we use the same logistic function as in classification, $\frac{1}{1+e^{-\theta^T x}}$, but we sometimes call it a sigmoid (logistic) activation function. In this setting the "theta" parameters are sometimes called "weights".
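A single neuron of this kind can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names `sigmoid` and `neuron` and the weight values are hypothetical, chosen so the arithmetic is easy to follow.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic activation g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(theta, x):
    """Output of one logistic unit.

    theta: weights [theta_0, theta_1, ..., theta_n], theta_0 multiplies the bias unit.
    x:     input features [x_1, ..., x_n], without the bias unit.
    """
    inputs = [1.0] + list(x)  # prepend the bias unit x_0 = 1
    z = sum(t * xi for t, xi in zip(theta, inputs))
    return sigmoid(z)

theta = [0.5, -1.0, 2.0]           # hypothetical weights
print(neuron(theta, [1.0, 0.25]))  # z = 0.5 - 1.0 + 0.5 = 0, so the output is 0.5
```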

The figure shows a model with a single neuron; the yellow circle is the neuron's cell body.


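Here $x_0 = 1$ is called the bias unit, and $a_0^{(2)}$, called the mixture bias unit, is also 1. These are usually not drawn; it is enough to know they exist. We call Layer 1 the input layer and Layer 3 the output layer, and every layer in between (here only Layer 2) is called a hidden layer. In this example, $a_0^{(2)}, a_1^{(2)}, a_2^{(2)}, a_3^{(2)}$ are called activation units.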

3. Mathematical Definition




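The values of the "activation" nodes are obtained as follows:

$$
\begin{aligned}
a_1^{(2)} &= g(\Theta_{10}^{(1)}x_0 + \Theta_{11}^{(1)}x_1 + \Theta_{12}^{(1)}x_2 + \Theta_{13}^{(1)}x_3) \\
a_2^{(2)} &= g(\Theta_{20}^{(1)}x_0 + \Theta_{21}^{(1)}x_1 + \Theta_{22}^{(1)}x_2 + \Theta_{23}^{(1)}x_3) \\
a_3^{(2)} &= g(\Theta_{30}^{(1)}x_0 + \Theta_{31}^{(1)}x_1 + \Theta_{32}^{(1)}x_2 + \Theta_{33}^{(1)}x_3) \\
h_\Theta(x) = a_1^{(3)} &= g(\Theta_{10}^{(2)}a_0^{(2)} + \Theta_{11}^{(2)}a_1^{(2)} + \Theta_{12}^{(2)}a_2^{(2)} + \Theta_{13}^{(2)}a_3^{(2)})
\end{aligned}
$$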


This says that we compute our activation nodes using a 3×4 matrix of parameters. We apply each row of the parameters to our inputs to obtain the value of one activation node. Our hypothesis output is the logistic function applied to the weighted sum of our activation nodes, where the weights come from another parameter matrix $\Theta^{(2)}$ that holds the weights for our second layer of nodes.


Each layer gets its own matrix of weights, $\Theta^{(j)}$. The dimensions of these weight matrices are determined as follows:

If the network has $s_j$ units in layer $j$ and $s_{j+1}$ units in layer $j+1$, then $\Theta^{(j)}$ has dimension $s_{j+1} \times (s_j + 1)$. The $+1$ accounts for the bias unit of layer $j$.
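The dimension rule can be turned into a small helper. This is a sketch only; `theta_shapes` is a hypothetical name, and the layer sizes are assumed to exclude the bias units.

```python
def theta_shapes(layer_sizes):
    """Return the (rows, cols) shape of each weight matrix Theta^(j).

    layer_sizes: units per layer, bias units excluded, e.g. [3, 3, 1].
    Theta^(j) maps layer j to layer j+1, so it has s_{j+1} rows and s_j + 1 columns.
    """
    return [
        (layer_sizes[j + 1], layer_sizes[j] + 1)
        for j in range(len(layer_sizes) - 1)
    ]

# Three inputs, three hidden units, one output: the 3x4 matrix described above,
# followed by a 1x4 matrix for the output layer.
print(theta_shapes([3, 3, 1]))  # [(3, 4), (1, 4)]
```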

4. Vectorized Implementation





$$\text{setting } a_0^{(2)} = 1 \implies z^{(3)} = \Theta^{(2)} a^{(2)} \implies h_\Theta(x) = a^{(3)} = g(z^{(3)})$$


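$$
\begin{aligned}
z^{(j)} &= \Theta^{(j-1)} a^{(j-1)} && (1) \\
a^{(j)} &= g(z^{(j)}), \quad \text{setting } a_0^{(j)} = 1 && (2) \\
h_\Theta(x) &= a^{(j)} = g(z^{(j)}) && (3)
\end{aligned}
$$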

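Apply (1) and (2) repeatedly, layer by layer; the value of the activation unit in the last layer is then $h_\Theta(x)$. Because the computation runs from the input layer through the hidden layer to the output layer, it is also called forward propagation.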
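Forward propagation as described can be sketched in plain Python, with lists standing in for vectors and matrices. The function name `forward` and the all-zero weights are assumptions made for illustration; with zero weights every unit outputs g(0) = 0.5, which makes the result easy to verify by hand.

```python
import math

def sigmoid(z):
    """Logistic activation g(z)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(thetas, x):
    """Forward propagation through a fully connected network.

    thetas: list of weight matrices, each a list of rows; row k holds the
            weights of unit k, with the bias weight first.
    x:      input features without the bias unit.
    Returns the activations of the last layer, i.e. h_Theta(x).
    """
    a = list(x)
    for theta in thetas:
        a = [1.0] + a                                                # set the bias unit a_0 = 1
        z = [sum(w * ai for w, ai in zip(row, a)) for row in theta]  # step (1)
        a = [sigmoid(zi) for zi in z]                                # step (2)
    return a                                                         # step (3)

# Hypothetical weights: a 3x4 matrix and a 1x4 matrix, all zeros.
thetas = [[[0.0] * 4 for _ in range(3)], [[0.0] * 4]]
print(forward(thetas, [1.0, 2.0, 3.0]))  # [0.5]
```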

5. Examples and Intuitions

Implementing logic gates with a neural network

Example 1. AND gate

Remember that x0 is our bias variable and is always 1.

Let's set our first theta matrix as:

$$\Theta^{(1)} = \begin{bmatrix} -30 & 20 & 20 \end{bmatrix}$$


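This makes the output of our hypothesis close to 1 only if both $x_1$ and $x_2$ are 1. In other words:

$$
\begin{aligned}
h_\Theta(x) &= g(-30 + 20x_1 + 20x_2) \\
x_1 = 0 \text{ and } x_2 = 0 &\text{ then } g(-30) \approx 0 \\
x_1 = 0 \text{ and } x_2 = 1 &\text{ then } g(-10) \approx 0 \\
x_1 = 1 \text{ and } x_2 = 0 &\text{ then } g(-10) \approx 0 \\
x_1 = 1 \text{ and } x_2 = 1 &\text{ then } g(10) \approx 1
\end{aligned}
$$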
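The truth table can be reproduced numerically with the weights −30, 20, 20 from the hypothesis above. This is a minimal sketch; the function name `and_gate` is hypothetical, and rounding to 0 or 1 is added only to make the printed table readable.

```python
import math

def sigmoid(z):
    """Logistic activation g(z)."""
    return 1.0 / (1.0 + math.exp(-z))

def and_gate(x1, x2):
    """h_Theta(x) = g(-30 + 20*x1 + 20*x2), with the bias unit x0 = 1."""
    theta = [-30.0, 20.0, 20.0]
    return sigmoid(theta[0] * 1.0 + theta[1] * x1 + theta[2] * x2)

# Print the truth table; only the input (1, 1) rounds to 1.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(and_gate(x1, x2)))
```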

Example 2. XNOR gate

Layer 2:


Layer 3:
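The weights for the two layers are not reproduced in the text here. As a hedged sketch, one common construction builds XNOR as (x1 AND x2) OR ((NOT x1) AND (NOT x2)): the hidden layer computes the two AND-style units and the output layer combines them with OR. The specific weight values and the function name `xnor_gate` below are assumptions for illustration, not necessarily those used in the original figures.

```python
import math

def sigmoid(z):
    """Logistic activation g(z)."""
    return 1.0 / (1.0 + math.exp(-z))

# Assumed weights (bias weight first in each row).
THETA1 = [
    [-30.0, 20.0, 20.0],   # hidden unit 1: x1 AND x2
    [10.0, -20.0, -20.0],  # hidden unit 2: (NOT x1) AND (NOT x2)
]
THETA2 = [-10.0, 20.0, 20.0]  # output unit: a1 OR a2

def xnor_gate(x1, x2):
    """Two-layer forward pass computing x1 XNOR x2 (close to 1 when x1 == x2)."""
    x = [1.0, x1, x2]                                     # layer 1 with bias unit
    a2 = [sigmoid(sum(w * xi for w, xi in zip(row, x)))   # layer 2 activations
          for row in THETA1]
    a2 = [1.0] + a2                                       # add bias unit a_0 = 1
    return sigmoid(sum(w * ai for w, ai in zip(THETA2, a2)))  # layer 3 output

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(xnor_gate(x1, x2)))
```

With these weights the output rounds to 1 for inputs (0, 0) and (1, 1), and to 0 otherwise.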

