Thoughts on reading the paper: A Simple But Tough-To-Beat Baseline For Sentence Embeddings
2018-04-02 10:44
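The algorithm is meaningful in its own right: through its analysis it arrives at a weighted average of word embeddings that gives better results than a plain average or a TF-IDF-weighted average. Because it is cheap to compute, it is, as the authors say, a good choice as a stronger baseline.

That said, some of the paper's claims are somewhat overstated, and a few small tricks are involved. The claim that it outperforms a supervised LSTM, for instance, is debatable, because sentence embedding is itself a form of feature extraction. Neural networks are powerful, but what hurts them most is "cooking without rice": when the data is wrong or poor, they often do worse than features hand-designed for the task. The paper relies on exactly this kind of trick: the LSTM is trained on sentence-pair task data. Well, I think that task, as it currently stands, does not capture the information we want in the way we would like, while plain LSI (SVD on the term-frequency matrix to obtain sentence vectors) or LDA (a topic model) can also reach good performance. This is where hand-crafted features and automatically learned features contend: a better task and better data would let the LSTM learn better features and realize more of its potential.

As for the training corpus, from what I read, the method in the paper seems to make a pass over the test data first to tune its parameters? And the supervised and other methods cannot adjust their parameters to the given corpus? If that is the case, I think it is also an important reason for the gap in the experimental results.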
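To make the weighted average mentioned at the start of this note concrete, here is a minimal sketch in plain Python. It assumes the paper's weighting scheme a / (a + p(w)), where p(w) is a word's estimated unigram probability and a is a small smoothing constant. The function names, the toy corpus, the two-dimensional embeddings, and the value a = 0.1 are all made up for illustration. The paper's additional step of removing the common component (the projection onto the first principal direction of the sentence vectors) is left out here, so this is only the averaging half of the method, not a full reimplementation.

```python
from collections import Counter


def sif_weights(corpus_tokens, a=1e-3):
    """Map each word to a / (a + p(w)), with p(w) estimated from corpus counts."""
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    return {word: a / (a + count / total) for word, count in counts.items()}


def weighted_sentence_vector(tokens, embeddings, weights):
    """Average the word vectors of a sentence, scaling each by its word weight."""
    known = [t for t in tokens if t in embeddings and t in weights]
    if not known:
        raise ValueError("no token has both an embedding and a weight")
    dim = len(embeddings[known[0]])
    vec = [0.0] * dim
    for token in known:
        w = weights[token]
        for i, x in enumerate(embeddings[token]):
            vec[i] += w * x
    # Divide by the number of words used, as in a plain average.
    return [v / len(known) for v in vec]


# Toy data (hypothetical): "the" is frequent, "cat" is rare.
toy_corpus = ["the"] * 9 + ["cat"]
toy_embeddings = {"the": [1.0, 0.0], "cat": [0.0, 1.0]}

toy_weights = sif_weights(toy_corpus, a=0.1)
sentence_vec = weighted_sentence_vector(["the", "cat"], toy_embeddings, toy_weights)
```

In this toy setup the frequent word "the" gets a much smaller weight than the rare word "cat", so the sentence vector leans toward the "cat" direction. A plain average would weight both words equally.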