
What's Wrong with Deep Learning?

2015-10-14 14:56
Yann LeCun
 
Deep learning methods have had a profound impact on a number of areas in recent years, including natural image understanding and speech recognition. Other areas seem on the verge of being similarly impacted, notably natural language processing, biomedical image analysis, and the analysis of sequential signals in a variety of application domains. But deep learning systems, as they exist today, have many limitations.
First, they lack mechanisms for reasoning, search, and inference. Complex and/or ambiguous inputs require deliberate reasoning to arrive at a consistent interpretation. Producing structured outputs, such as a long text, or a label map for image segmentation, requires sophisticated search and inference algorithms to satisfy complex sets of constraints. One approach to this problem is to marry deep learning with structured prediction (an idea first presented at CVPR 1997). While several deep learning systems augmented with structured prediction modules trained end to end have been proposed for OCR, body pose estimation, and semantic segmentation, new concepts are needed for tasks that require more complex reasoning.
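To make the idea of search over structured outputs concrete, here is a minimal sketch (not part of the talk itself; all scores here are made up for illustration): per-position label scores, of the kind a neural net might emit, are combined with pairwise transition scores, and Viterbi search finds the jointly best label sequence instead of picking each label greedily.

```python
def viterbi(emission, transition):
    """emission[t][y]: score of label y at position t;
    transition[y][y2]: score of label y followed by y2.
    Returns the highest-scoring label sequence."""
    n_labels = len(emission[0])
    best = list(emission[0])   # best score of any path ending in each label
    back = []                  # backpointers, one list per later position
    for t in range(1, len(emission)):
        ptr, new = [], []
        for y2 in range(n_labels):
            scores = [best[y] + transition[y][y2] for y in range(n_labels)]
            y = max(range(n_labels), key=scores.__getitem__)
            ptr.append(y)
            new.append(scores[y] + emission[t][y2])
        best, back = new, back + [ptr]
    y = max(range(n_labels), key=best.__getitem__)
    path = [y]
    for ptr in reversed(back):  # follow backpointers to recover the path
        y = ptr[y]
        path.append(y)
    return path[::-1]

emission = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]  # hypothetical net scores
transition = [[0.0, 0.0], [-5.0, 0.0]]           # label 1 -> 0 is penalized
best_path = viterbi(emission, transition)
# greedy per-position choice would be [1, 1, 0]; the transition
# constraint makes the jointly best sequence [1, 1, 1]
```

The point of the example is only that the final labeling is chosen by inference over the whole sequence, not position by position; in a system trained end to end, the emission and transition scores would themselves be learned.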
Second, they lack short-term memory. Many tasks in natural language understanding, such as question answering, require a way to temporarily store isolated facts. Correctly interpreting events in a video and being able to answer questions about it requires remembering abstract representations of what happens in the video. Deep learning systems, including recurrent nets, are notoriously inefficient at storing temporary memories. This has led researchers to propose neural net systems augmented with separate memory modules, such as LSTM, Memory Networks, Neural Turing Machines, and Stack-Augmented RNNs. While these proposals are interesting, new ideas are needed.
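The gating idea behind such memory modules can be shown with a toy, hand-weighted single cell in the LSTM spirit (an illustration only, not any published model): an input gate decides when to write, and a forget gate decides whether to keep the stored value, so a fact can survive many timesteps of distractor input.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def step(c, x, write):
    """One update of a single gated memory cell c on input x."""
    i = sigmoid(10.0 if write else -10.0)      # input gate: open only on a write
    f = sigmoid(-10.0 if write else 10.0)      # forget gate: keep unless overwriting
    return f * c + i * x                       # gated cell update

c = 0.0
c = step(c, 0.7, write=True)        # store the fact 0.7
for _ in range(100):
    c = step(c, 0.3, write=False)   # distractor inputs; gates protect the memory
# c stays close to the stored 0.7 despite 100 distractor steps
```

A plain recurrent unit without gates would have blended the stored value with every subsequent input; the gates are what make the storage last.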
Lastly, they lack the ability to perform unsupervised learning. Animals and humans learn most of the structure of the perceptual world in an unsupervised manner. While the interest of the ML community in neural nets was revived in the mid-2000s by progress in unsupervised learning, the vast majority of practical applications of deep learning have used purely supervised learning. There is little doubt that future progress in computer vision will require breakthroughs in unsupervised learning, particularly for video understanding. But what principles should unsupervised learning be based on?
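A minimal sketch of what "learning structure without labels" means (a classical example, not a proposal from the talk; the synthetic data and the use of Oja's rule are assumptions for illustration): a single linear neuron trained only to track its own input recovers the dominant direction of variation in the data, with no labels anywhere.

```python
import random

random.seed(0)
# synthetic 2-D data stretched along the direction (1, 1), plus small noise
data = [(t + random.gauss(0, 0.1), t + random.gauss(0, 0.1))
        for t in (random.uniform(-1, 1) for _ in range(2000))]

w = [1.0, 0.0]                        # initial weight vector
lr = 0.05
for x in data:
    z = w[0] * x[0] + w[1] * x[1]     # encode: project the input onto w
    # Oja's rule: move w toward the input, with a decay term that
    # keeps ||w|| near 1; converges to the top principal direction
    w = [w[0] + lr * z * (x[0] - z * w[0]),
         w[1] + lr * z * (x[1] - z * w[1])]
# w ends up near the unit vector along (1, 1), found purely unsupervised
```

This is of course far from the kind of unsupervised learning the question above calls for, but it shows the basic premise: the regularities of the input distribution alone can drive the learning signal.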
Preliminary work in each of these areas paves the way for future progress in image and video understanding.
 
Tags: Deep Learning