An Analysis of Lucene Analyzers
2014-05-07 17:26
import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.SimpleAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
import org.apache.lucene.util.Version;

public class AnalyzerTest {

    // Runs txt through the given analyzer and prints each token's offsets and term text.
    public static void analysis(Analyzer analyzer, String txt) throws IOException {
        System.out.println("analyzer:" + analyzer.getClass());
        TokenStream stream = analyzer.tokenStream("content", new StringReader(txt));
        // The attribute instances are reused for every token, so fetch them once.
        CharTermAttribute attribute = stream.addAttribute(CharTermAttribute.class);
        OffsetAttribute offsetAttribute = stream.addAttribute(OffsetAttribute.class);
        stream.reset(); // must be called before the first incrementToken()
        while (stream.incrementToken()) {
            System.out.println("off:" + offsetAttribute.startOffset() + "----" + offsetAttribute.endOffset());
            System.out.println("attr:" + attribute.toString());
        }
        stream.end();
        stream.close(); // release the stream so the analyzer can be reused
    }

    public static void main(String[] args) throws IOException {
        // Swap the assignment below to compare analyzers; the last one assigned is used.
        Analyzer a = new StandardAnalyzer(Version.LUCENE_48);
        a = new SimpleAnalyzer(Version.LUCENE_48);
        // a = new CJKAnalyzer(Version.LUCENE_48);
        // a = new MyStopAnalyzer();
        String txt = "this is a txt";
        System.out.println("textLength:" + txt.length());
        // Offsets are char indexes, end-exclusive, like String.substring.
        System.out.println("0-4:" + txt.substring(0, 4));
        String zhTxt = "这是中文测试,hello 中文 The i am i am";
        // analysis(a, txt);
        analysis(a, zhTxt);
    }
}
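To make the printed offsets easier to read, here is a minimal sketch in plain Java that approximates what SimpleAnalyzer is documented to do: split the text at non-letter characters and lowercase each piece. The class SimpleAnalyzerSketch and its tokenize helper are hypothetical names made up for this illustration and are not part of Lucene; the sketch also works on chars rather than code points and ignores Lucene's maximum token length, so it is only an approximation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for SimpleAnalyzer, for illustration only (not Lucene code).
public class SimpleAnalyzerSketch {

    // Returns one "start-end:term" string per maximal run of letters.
    // Offsets are char indexes, end-exclusive, like String.substring.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int start = -1; // start of the current letter run, or -1 if outside a run
        for (int i = 0; i <= text.length(); i++) {
            boolean letter = i < text.length() && Character.isLetter(text.charAt(i));
            if (letter && start < 0) {
                start = i; // a new token begins here
            } else if (!letter && start >= 0) {
                String term = text.substring(start, i).toLowerCase(Locale.ROOT);
                tokens.add(start + "-" + i + ":" + term);
                start = -1;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("这是中文测试,hello 中文 The i am i am")) {
            System.out.println(token);
        }
    }
}
```

Because Chinese characters count as letters, the sketch keeps 这是中文测试 as a single token covering offsets 0 to 6; the comma and the spaces are the only split points. This is why a letter-based analyzer is a poor fit for Chinese text, and why the commented-out CJKAnalyzer is worth comparing.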