[Example] Generating a CoNLL file by calling Java from cmd (stanford-corenlp)

2018-03-01 13:32

java -cp "*" -Xmx500m edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos -file english.txt -outputFormat conll
-----------------------------------------------------------------------------------
E:\cornlp\stanford-corenlp-full-2018-01-31>java -cp "*" -Xmx500m edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos -file english.txt -outputFormat conll
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.TokenizerAnnotator - No tokenizer type provided. Defaulting to PTBTokenizer.
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
[main] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [2.0 sec].

Processing file E:\cornlp\stanford-corenlp-full-2018-01-31\english.txt ... writing to E:\cornlp\stanford-corenlp-full-2018-01-31\english.txt.conll
Annotating file E:\cornlp\stanford-corenlp-full-2018-01-31\english.txt ... done [0.3 sec].

Annotation pipeline timing information:
TokenizerAnnotator: 0.2 sec.
WordsToSentencesAnnotator: 0.0 sec.
POSTaggerAnnotator: 0.1 sec.
TOTAL: 0.3 sec. for 34 tokens at 133.9 tokens/sec.
Pipeline setup: 2.1 sec.

Total time for StanfordCoreNLP pipeline: 2.5 sec.
------------------------------------------------------------------------------------
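The run above writes its result to english.txt.conll. As a minimal sketch of how that file could be read back, the class below pulls the token index, word, and POS tag out of each row. It assumes the column order visible in the interactive output further down (index, word, lemma, POS, followed by unused columns) and that a blank line separates sentences. The class name ConllReader and its Token type are hypothetical names chosen for illustration; they are not part of CoreNLP.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of CoreNLP: reads rows of the conll output.
public class ConllReader {

    /** One token row: position in the sentence, surface form, POS tag. */
    public static final class Token {
        public final int index;
        public final String word;
        public final String pos;

        Token(int index, String word, String pos) {
            this.index = index;
            this.word = word;
            this.pos = pos;
        }
    }

    /** Parses CoNLL rows; blank lines (sentence boundaries) are skipped. */
    public static List<Token> parse(List<String> lines) {
        List<Token> tokens = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // sentence boundary
            }
            // Split on any whitespace so both tab-separated files and
            // space-separated copies of the output are accepted.
            String[] cols = trimmed.split("\\s+");
            if (cols.length < 4) {
                continue; // not a token row
            }
            // Assumed columns: 0 = index, 1 = word, 2 = lemma, 3 = POS tag.
            tokens.add(new Token(Integer.parseInt(cols[0]), cols[1], cols[3]));
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        // Default file name taken from the log above; pass another path as args[0].
        String path = args.length > 0 ? args[0] : "english.txt.conll";
        List<String> lines = Files.readAllLines(Paths.get(path), StandardCharsets.UTF_8);
        for (Token t : parse(lines)) {
            System.out.println(t.index + "\t" + t.word + "\t" + t.pos);
        }
    }
}
```

Because only tokenize, ssplit, and pos were requested, the lemma and remaining columns hold the placeholder "_", which is why the sketch keeps just three fields.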

--------------------------------------------------------------------
E:\cornlp\stanford-corenlp-full-2018-01-31>java -cp "*" -Xmx500m edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos -encoding utf-8 /file a.txt -outputFormat conll
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.TokenizerAnnotator - No tokenizer type provided. Defaulting to PTBTokenizer.
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
[main] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [2.2 sec].

Entering interactive shell. Type q RETURN or EOF to quit.
NLP> StanfordCoreNLP(r'E:/cornlp/stanford-corenlp-full-2018-01-31/',lang='zh')
1 StanfordCoreNLP _ NN _ _ _
2 -LRB- _ -LRB- _ _ _
3 r _ NN _ _ _
4 ` _ `` _ _ _
5 E _ NN _ _ _
6 : _ : _ _ _
7 / _ : _ _ _
8 cornlp/stanford-corenlp-full _ JJ _ _ _
9 -2018-01-31 _ CD _ _ _
10 / _ : _ _ _
11 ' _ '' _ _ _
12 , _ , _ _ _
13 lang _ NN _ _ _
14 = _ JJ _ _ _
15 ` _ `` _ _ _
16 zh _ NN _ _ _
17 ' _ '' _ _ _
18 -RRB- _ -RRB- _ _ _

NLP>

Note: in the second run the option was typed as "/file a.txt" rather than "-file a.txt". CoreNLP therefore apparently received no input file and entered its interactive shell, where the line typed at the NLP> prompt was tokenized and tagged as English text.
