DLch1
2015-12-11 02:13
Introduction
Computers have long handled problems that are intellectually difficult for human beings but relatively straightforward for machines: problems that can be described by a list of formal, mathematical rules. The real challenge lies in solving tasks that are easy for people to perform but hard for people to describe formally, problems that we solve intuitively, that feel automatic.
The solution is to allow computers to learn from experience and understand the world in terms of a hierarchy of concepts, with each concept defined in terms of its relation to simpler concepts. By gathering knowledge from experience, this approach avoids the need for human operators to formally specify all of the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones. If we draw a graph showing how these concepts are built on top of each other, the graph is deep, with many layers.
The performance of these simple machine learning algorithms depends heavily on the representation of the data they are given.
One approach is to use machine learning to discover not only the mapping from representation to output but also the representation itself. This approach is known as representation learning.
The quintessential example of a representation learning algorithm is the autoencoder.
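The autoencoder idea can be sketched with a tiny linear example: an encoder compresses the input into a lower-dimensional code, and a decoder tries to reconstruct the original input from that code. The weights below are hand-set for illustration, not trained.

```python
# Minimal linear autoencoder sketch (illustrative, not from the book's text):
# the encoder projects 2-D inputs onto a single direction (the 1-D "code"),
# and the decoder maps the code back to 2-D.

ENC = [0.6, 0.8]  # unit-length encoder direction (hypothetical values)

def encode(x):
    # 1-D code: projection of x onto ENC
    return x[0] * ENC[0] + x[1] * ENC[1]

def decode(code):
    # reconstruction: scale the encoder direction by the code
    return [code * ENC[0], code * ENC[1]]

def reconstruction_error(x):
    r = decode(encode(x))
    return sum((a - b) ** 2 for a, b in zip(x, r))

# Inputs lying along ENC survive the 1-D bottleneck exactly; any
# off-direction component is lost, producing a nonzero error.
print(reconstruction_error([1.2, 1.6]))  # on the line: error 0.0
print(reconstruction_error([1.0, 0.0]))  # off the line: nonzero error
```

A real autoencoder learns ENC (and a nonlinear encoder/decoder) from data so that the code preserves the information most useful for reconstruction.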
The quintessential example of a deep learning model is the feedforward deep network or multilayer perceptron (MLP).
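An MLP is just a composition of simple functions. The sketch below uses hand-set (not learned) weights and one hidden ReLU layer to compute XOR, a mapping that no single linear layer can represent:

```python
# Minimal multilayer perceptron sketch: two hidden ReLU units composed with
# a linear output layer compute XOR. Weights are hand-set for illustration.

def relu(z):
    return max(0.0, z)

def mlp_xor(x1, x2):
    # hidden layer: each unit is an affine map followed by ReLU
    h1 = relu(1.0 * x1 + 1.0 * x2 + 0.0)   # fires when either input is 1
    h2 = relu(1.0 * x1 + 1.0 * x2 - 1.0)   # fires only when both are 1
    # output layer: linear combination of the hidden features
    return 1.0 * h1 - 2.0 * h2

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, mlp_xor(a, b))  # prints 0.0, 1.0, 1.0, 0.0
```

The hidden layer learns (here, is given) a new representation of the input under which the problem becomes linearly separable, which is exactly the representation-learning view of deep models.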
The idea of learning the right representation for the data provides one perspective on deep learning. Another perspective on deep learning is that it allows the computer to learn a multi-step computer program.
There are two main ways of measuring the depth of a model.
The first view is based on the number of sequential instructions that must be executed to evaluate the architecture. We can think of this as the length of the longest path through a flow chart that describes how to compute each of the outputs of the model given its inputs.
Another approach, used by deep probabilistic models, regards the depth of a model not as the depth of the computational graph but as the depth of the graph describing how concepts are related to each other. In this case, the flow chart of the computations needed to compute the representation of each concept may be much deeper than the graph of the concepts themselves.
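The first view of depth can be made concrete: model the computation as a directed acyclic graph and take the longest input-to-output path. The graph below is a hypothetical flow chart with a skip connection, invented for illustration.

```python
# Depth as the longest path through a computational flow chart.
# Each node maps to the nodes it feeds into.

def longest_path(graph, node):
    # depth 0 at a sink; otherwise 1 + the deepest successor
    if not graph[node]:
        return 0
    return 1 + max(longest_path(graph, nxt) for nxt in graph[node])

flow_chart = {
    "input": ["h1", "out"],  # skip connection straight to the output
    "h1": ["h2"],
    "h2": ["out"],
    "out": [],
}
print(longest_path(flow_chart, "input"))  # 3, via input -> h1 -> h2 -> out
```

Note the skip edge does not reduce the depth: the measure is the longest path, not the shortest.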
Deep learning can safely be regarded as the study of models that involve a greater amount of composition of either learned functions or learned concepts than traditional machine learning does.
1.1 Who Should Read This Book
1.2 Historical Trends in Deep Learning
Broadly speaking, there have been three waves of development of deep learning: deep learning known as cybernetics in the 1940s-1960s; deep learning known as connectionism in the 1980s-1990s; and the current resurgence under the name deep learning beginning in 2006.
The main reason for the diminished role of neuroscience in deep learning research today is that we simply do not have enough information about the brain to use it as a guide.
This suggests that much of the mammalian brain might use a single algorithm to solve most of the different tasks that the brain solves.
One should not view deep learning as an attempt to simulate the brain. Modern deep learning draws inspiration from many fields, especially applied math fundamentals.
In the 1980s, the second wave of neural network research emerged in great part via a movement called connectionism or parallel distributed processing. The central idea is that a large number of simple computational units can achieve intelligent behavior when networked together. Several key concepts arose:
One is distributed representation: each input to a system should be represented by many features, and each feature should be involved in the representation of many possible inputs. The concept of distributed representation is central to this book.
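A toy example (invented for illustration, not from the book's text) shows why this is efficient: with shared features, 3 color detectors plus 3 shape detectors can represent all 9 colored shapes, whereas a local (one-hot) representation would need 9 separate detectors.

```python
# Sketch of a distributed representation: each input (a colored shape) is
# described by shared features, and each feature helps represent many inputs.

COLORS = ["red", "green", "blue"]
SHAPES = ["circle", "square", "triangle"]

def distributed(color, shape):
    # 3 + 3 features cover all 3 * 3 = 9 possible inputs
    return [int(color == c) for c in COLORS] + [int(shape == s) for s in SHAPES]

print(distributed("red", "square"))  # [1, 0, 0, 0, 1, 0]
```

The number of features grows additively with the factors of variation, while the number of distinguishable inputs grows multiplicatively.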
Another was the successful use of back-propagation to train deep neural networks with internal representations, which, as of this writing, remains the dominant approach to training deep models.
The second wave then ebbed, due in part to advances in other fields of machine learning, notably kernel machines and graphical models.
The third wave began with a breakthrough in 2006: Geoffrey Hinton showed that a kind of neural network called a deep belief network could be efficiently trained using a strategy called greedy layer-wise pretraining. Neural networks also regained ground in part because the time and memory cost of training a kernel machine is quadratic in the size of the dataset, and datasets grew large enough for this cost to outweigh the benefits of convex optimization.
The most important new development is that today we can provide these algorithms with the resources they need to succeed. As of 2015, a rough rule of thumb is that a supervised deep learning algorithm will generally achieve acceptable performance with around 5,000 labeled examples per category, and will match or exceed human performance when trained with a dataset containing at least 10 million labeled examples.
Another key reason that neural networks are wildly successful today is that we have the computational resources to run much larger models.