
Getting Started with Stanford Parser, Part 1: Parsing a Single Chinese Sentence

2017-10-01 14:02

Environment: Windows 10 + Java 8 (jdk-8u111) + stanford-parser-full-2015-12-09

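To run the official Stanford Parser Java demos in Eclipse, see the article "Using Stanford Parser for Syntactic Parsing" listed in the references. Taking ParserDemo.java as an example: right-click ParserDemo.java in Eclipse and set the run configuration's Arguments to:
edu/stanford/nlp/models/lexparser/chinesePCFG.ser.gz data/chinese-onesent-utf8.txt
With these arguments, the demo parses Chinese text.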
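
The two arguments are resolved in different ways, which is an easy place to get a "file not found" error. As far as I understand the loader, the model path is looked up on the classpath (the file lives inside the models jar), while the text path is an ordinary file path relative to the working directory, which in Eclipse defaults to the project root. Below is a minimal pre-flight check using only the JDK. The class name ParserArgsCheck and its two helper methods are made up for illustration and are not part of Stanford Parser.

```java
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class ParserArgsCheck {
    /** Returns true if the named resource is visible on the classpath. */
    static boolean onClasspath(String resource) {
        URL url = ParserArgsCheck.class.getClassLoader().getResource(resource);
        return url != null;
    }

    /** Returns true if the path (relative to the working directory) is a readable regular file. */
    static boolean readableFile(String path) {
        Path p = Paths.get(path);
        return Files.isRegularFile(p) && Files.isReadable(p);
    }

    public static void main(String[] args) {
        // Same two values as the Arguments line above.
        String model = "edu/stanford/nlp/models/lexparser/chinesePCFG.ser.gz";
        String input = "data/chinese-onesent-utf8.txt";

        System.out.println("working directory:    " + Paths.get("").toAbsolutePath());
        System.out.println("model on classpath:   " + onClasspath(model));
        System.out.println("input file readable:  " + readableFile(input));
    }
}
```

If the first check prints false, the models jar is probably missing from the build path; if the second prints false, the data folder is probably not where the working directory expects it.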

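Here is a simple example of parsing Chinese.
1. Download the latest package from the official Stanford site:
http://nlp.stanford.edu/software/lex-parser.html#Download

2. Unzip the downloaded archive, stanford-parser-full-2015-12-09.zip. It contains the data files, the dependency jars, and the demos, along with the source code and Javadoc.

3. In Eclipse, create a project named stanfordparser and add stanford-parser-3.6.0-models.jar, stanford-parser.jar, slf4j-simple.jar, and slf4j-api.jar to its build path.

4. Copy the data folder from the archive unzipped in step 2 into the Eclipse project, then create a class ParserTest1.java with the following code: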

import java.io.IOException;

import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.trees.Tree;

public class ParserTest1 {
    public static void main(String[] args) throws IOException {
        // Alternative model: the factored Chinese grammar.
        // String grammar = "edu/stanford/nlp/models/lexparser/chineseFactored.ser.gz";
        String grammar = "edu/stanford/nlp/models/lexparser/chinesePCFG.ser.gz";
        String[] options = {};
        LexicalizedParser lp = LexicalizedParser.loadModel(grammar, options);

        // Part 1: parse a single sentence passed as a string.
        // The words are already segmented and separated by spaces.
        String line = "这 是 一个 简单 的 例子";
        Tree parse = lp.parse(line);
        parse.pennPrint();

        // Part 2: call the parser's command-line entry point on a file,
        // printing both the Penn tree and the collapsed typed dependencies.
        String[] arg2 = {
            "-encoding", "utf-8",
            "-outputFormat", "penn,typedDependenciesCollapsed",
            "edu/stanford/nlp/models/lexparser/chineseFactored.ser.gz",
            "data/chinese-onesent-utf8.txt"
        };
        LexicalizedParser.main(arg2);
    }
}

5. Run it. The output is:
[main] INFO edu.stanford.nlp.parser.lexparser.LexicalizedParser - Loading parser from serialized file edu/stanford/nlp/models/lexparser/chinesePCFG.ser.gz ... done [1.6 sec].
(ROOT
  (IP
    (NP (PN 这))
    (VP (VC 是)
      (NP
        (QP (CD 一个))
        (CP
          (IP
            (VP (VA 简单)))
          (DEC 的))
        (NP (NN 例子))))))
[main] INFO edu.stanford.nlp.parser.lexparser.LexicalizedParser - Loading parser from serialized file edu/stanford/nlp/models/lexparser/chineseFactored.ser.gz ... done [6.0 sec].
Parsing file: data/chinese-onesent-utf8.txt
Parsing [sent. 1 len. 8]: 俄国希望伊朗没有制造核武器计划。
(ROOT
  (IP
    (NP (NR 俄国))
    (VP (VV 希望)
      (IP
        (NP (NR 伊朗))
        (VP
          (ADVP (AD 没有))
          (VP (VV 制造)
            (NP (NN 核武器) (NN 计划))))))
    (PU 。)))

nsubj(希望-2, 俄国-1)
root(ROOT-0, 希望-2)
nsubj(制造-5, 伊朗-3)
neg(制造-5, 没有-4)
ccomp(希望-2, 制造-5)
nn(计划-7, 核武器-6)
dobj(制造-5, 计划-7)

Parsed file: data/chinese-onesent-utf8.txt [1 sentences].
Parsed 8 words in 1 sentences (22.66 wds/sec; 2.83 sents/sec).
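
Both kinds of output are plain text: a bracketed Penn tree, and dependency lines of the form relation(governor-i, dependent-j), where the numbers are 1-based word positions and ROOT-0 marks the artificial root. The following is a rough sketch of how such text could be post-processed with regular expressions. It assumes the output keeps the shape the parser normally prints (a space between tag and word, one dependency per line). ParserOutputReader and its methods are hypothetical helper names, not part of Stanford Parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParserOutputReader {
    // A preterminal such as "(NN 例子)": a tag and a word with no nested brackets.
    private static final Pattern PRETERMINAL =
            Pattern.compile("\\(([^\\s()]+)\\s+([^\\s()]+)\\)");
    // A dependency line such as "nsubj(希望-2, 俄国-1)".
    private static final Pattern DEPENDENCY =
            Pattern.compile("([^\\s(]+)\\((.+)-(\\d+),\\s*(.+)-(\\d+)\\)");

    /** One typed dependency: relation(governor-governorIndex, dependent-dependentIndex). */
    static final class Dependency {
        final String relation;
        final String governor;
        final int governorIndex;
        final String dependent;
        final int dependentIndex;

        Dependency(String relation, String governor, int governorIndex,
                   String dependent, int dependentIndex) {
            this.relation = relation;
            this.governor = governor;
            this.governorIndex = governorIndex;
            this.dependent = dependent;
            this.dependentIndex = dependentIndex;
        }
    }

    /** Collects "word/TAG" for every preterminal in a bracketed tree, left to right. */
    static List<String> taggedWords(String pennTree) {
        List<String> result = new ArrayList<>();
        Matcher m = PRETERMINAL.matcher(pennTree);
        while (m.find()) {
            result.add(m.group(2) + "/" + m.group(1));
        }
        return result;
    }

    /** Parses one dependency line; throws if the line does not have the expected shape. */
    static Dependency parseDependency(String line) {
        Matcher m = DEPENDENCY.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a dependency line: " + line);
        }
        return new Dependency(m.group(1),
                m.group(2), Integer.parseInt(m.group(3)),
                m.group(4), Integer.parseInt(m.group(5)));
    }

    public static void main(String[] args) {
        String tree = "(ROOT (IP (NP (PN 这)) (VP (VC 是) (NP (QP (CD 一个)) "
                + "(CP (IP (VP (VA 简单))) (DEC 的)) (NP (NN 例子))))))";
        System.out.println(taggedWords(tree));

        Dependency d = parseDependency("nsubj(希望-2, 俄国-1)");
        System.out.println(d.relation + ": " + d.dependent + " -> " + d.governor);
    }
}
```

Regular expressions are enough for reading printed output like this. When the Tree object is still available in your own code, working with it directly is likely the more robust route.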

References:
Stanford Parser usage guide (stanford parser使用说明)
http://blog.csdn.net/u010454729/article/details/46845403
Using Stanford Parser for Syntactic Parsing (使用Stanford Parser进行句法分析)
http://www.cnblogs.com/Denise-hzf/p/6612574.html

