
Neural Networks for Machine Learning: Lecture 15 Quiz

2016-02-18 09:44


Lecture 15 Quiz

Warning: The hard deadline has passed. You can attempt it, but you will not get credit for it. You are welcome to try it as a learning exercise.

In accordance with the Coursera Honor Code, I certify that the answers here are my own work.


Question 1

The objective function of an autoencoder is to reconstruct its input, i.e., it is trying to learn a function f such that f(x)=x for all points x in the dataset. Clearly there is a trivial solution to this: f can just copy the input to the output, so that f(x)=x for all x. Why does the network not learn to do this?

Optimization algorithms that are used to train the autoencoder are not exact. So the network cannot easily learn to copy the input to the output.
The network has constraints, such as bottleneck layers, sparsity and bounded activation functions, which make the network incapable of copying the entire input all the way to the output.
The objective function used to train an autoencoder is to minimize reconstruction error if x lies in the dataset but maximize it if x does not belong to the dataset. This prevents it from copying the input to the output for all x.
Since all the hidden units in an autoencoder are linear, the model is not powerful enough to learn the identity transform, unless the number of hidden units in each layer is not less than the number of input dimensions.
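The bottleneck constraint mentioned in the second option is easy to picture in code. Below is a minimal, hypothetical PyTorch sketch (PyTorch and the layer sizes are my own illustrative choices, not part of the course): because the 30-unit code is far narrower than the 784-dimensional input, the network cannot simply pass x through unchanged and must learn a compressed representation.

```python
# Hypothetical bottleneck autoencoder sketch (illustrative sizes, not from the course).
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 30))   # 30-unit bottleneck
decoder = nn.Sequential(nn.Linear(30, 256), nn.ReLU(), nn.Linear(256, 784), nn.Sigmoid())

x = torch.rand(64, 784)                             # stand-in minibatch of flattened images
reconstruction = decoder(encoder(x))
loss = nn.functional.mse_loss(reconstruction, x)    # penalize f(x) differing from x
loss.backward()                                     # gradients are exact and cheap to compute
```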


Question 2

The process of autoencoding a vector seems to lose some information, since the autoencoder cannot reconstruct the input exactly (as seen by the blurring of images reconstructed from 256-bit codes). In other words, the intermediate representation appears to have less information than the input representation. In that case, why is this intermediate representation more useful than the input representation?

The intermediate representation has more noise.
The intermediate representation actually has more information than the inputs.
The intermediate representation loses some information, but retains what is most important. We hope that this retained information is "semantic". The intermediate representation will then be a more direct way of representing semantic content.
The intermediate representation is more compressible than the input representation.


Question 3

What are some of the ways of regularizing deep autoencoders?

Using large minibatches for stochastic gradient descent.
Using a squared error loss function for the reconstruction.
Adding noise to the inputs.
Using a high learning rate and momentum.
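The "adding noise to the inputs" option corresponds to a denoising autoencoder. Here is a rough, hypothetical sketch of one training step; the model, names and noise level are assumptions for illustration only:

```python
# Hypothetical denoising step: corrupt the input, but score the reconstruction
# against the clean input, so the network cannot succeed by copying pixels.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 30), nn.ReLU(), nn.Linear(30, 784))

def denoising_loss(x, noise_std=0.3):
    x_noisy = x + noise_std * torch.randn_like(x)     # corrupt the input
    return nn.functional.mse_loss(model(x_noisy), x)  # compare against the clean target

loss = denoising_loss(torch.rand(8, 784))
loss.backward()
```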


Question 4

In all the autoencoders discussed in the lecture, the decoder network has the same number of layers and hidden units as the encoder network, but arranged in reverse order. Brian feels that this is not a strict requirement for building an autoencoder. He insists that we can build an autoencoder which has a very different decoder network than the encoder network. Which of the following statements is correct?

Brian is correct. We can indeed have any decoder network, as long as it produces output of the same shape as the data, so that we can compare the output to the original data and tell the network where it's making mistakes.
Brian is correct, as long as the decoder network has at least as many parameters as the encoder network.
Brian is mistaken. The decoder network must have the same architecture. Otherwise backpropagation will not work.
Brian is correct, as long as the decoder network has the same number of parameters as the encoder network.
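Brian's point can be made concrete with a deliberately mismatched pair of networks. In this hypothetical sketch (architectures invented for illustration) the decoder is deeper and wider than the encoder; all that matters is that its output has the same shape as the data, so a reconstruction error can be computed and backpropagated:

```python
# Hypothetical asymmetric autoencoder: the decoder does not mirror the encoder.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 100), nn.ReLU(), nn.Linear(100, 20))
decoder = nn.Sequential(                      # deeper and wider than the encoder
    nn.Linear(20, 300), nn.ReLU(),
    nn.Linear(300, 300), nn.ReLU(),
    nn.Linear(300, 784), nn.Sigmoid(),
)

x = torch.rand(16, 784)
out = decoder(encoder(x))
assert out.shape == x.shape                   # output matches the data shape, so the loss is defined
nn.functional.mse_loss(out, x).backward()     # backpropagation works through both networks
```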


Question 5

Another way of extracting short codes for images is to hash them using standard hash functions. These functions are very fast to compute, require no training and transform inputs into fixed-length representations. Why is it more useful to learn an autoencoder to do this?

Autoencoders have several hidden units, unlike hash functions.
Autoencoders have smooth objective functions, whereas standard hash functions have no concept of an objective function.
For an autoencoder, it is possible to invert the mapping and reconstruct the original input from the code using the decoder, while this is not true for most hash functions.
Autoencoders can be used to do semantic hashing, whereas standard hash functions do not respect semantics, i.e., two inputs that are close in meaning might be very far apart in the hashed space.
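Semantic hashing, as described in the lecture, thresholds the logistic code units of a trained encoder to obtain a short binary address. Below is a hypothetical sketch of extracting such bits (the encoder here is untrained, so the bits carry no real semantics; the sizes are illustrative):

```python
# Hypothetical extraction of a 32-bit code by thresholding logistic code units.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32), nn.Sigmoid())

x = torch.rand(4, 784)                     # stand-in images
bits = (encoder(x) > 0.5).int()            # one 32-bit binary code per image
print(bits[0].tolist())                    # e.g. [0, 1, 1, 0, ...], usable as a memory address
```

After training, images with similar content are meant to land at nearby addresses, which is exactly the property a generic hash function is designed to avoid.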


Question 6

RBMs and single-hidden-layer autoencoders can both be seen as different ways of extracting one layer of hidden variables from the inputs. In what sense are they different?

The objective function and its gradients are intractable to compute exactly for RBMs, but can be computed efficiently and exactly for autoencoders.
RBMs are undirected graphical models, but autoencoders are feed-forward neural nets.
RBMs define a probability distribution over the hidden variables conditioned on the visible units, while autoencoders define a deterministic mapping from inputs to hidden variables.
RBMs work only with binary inputs, but autoencoders work with all kinds of inputs.
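The stochastic-versus-deterministic distinction in the third option can be shown side by side. In this hypothetical sketch both models share the same made-up, untrained weights: the RBM samples binary hidden states from P(h=1|v), whereas the autoencoder's encoder always returns the same real-valued code for the same input.

```python
# Hypothetical comparison with shared, untrained weights (for illustration only).
import torch

W, b = torch.randn(784, 100), torch.zeros(100)

v = torch.rand(1, 784)
p_h = torch.sigmoid(v @ W + b)        # RBM: P(h_j = 1 | v) for each hidden unit
h_rbm = torch.bernoulli(p_h)          # stochastic binary sample; changes from run to run
h_ae = torch.sigmoid(v @ W + b)       # autoencoder encoder: deterministic code, always the same
```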


Question 7

Autoencoders seem like a very powerful and flexible way of learning hidden representations. You just need to get lots of data and ask the neural network to reconstruct it. Gradients and objective functions can be exactly computed. Any kind of data can be plugged in. What might be a limitation of these models?

Autoencoders cannot work with discrete-valued inputs.
The hidden representations are noisy.
If the input data comes with a lot of noise, the autoencoder is being forced to reconstruct noisy input data. Being a deterministic mapping, it has to spend a lot of capacity on modelling the noise in order to reconstruct correctly. That capacity is not being used for anything semantically valuable, which is a waste.
The inference process for finding the states of hidden units given the input is intractable for autoencoders.