Writing a Utility to Show Lucene's Internal Analysis (Tokenization) Process
2016-03-31 17:53
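This utility prints the tokens an analyzer produces for a piece of text. Each token also carries several attributes that the code does not display; see all the attribute classes in the package org.apache.lucene.analysis.tokenattributes.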
package com.liu.lucene.pro;

import java.io.IOException;
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;

public class AnalyzerUtils {

    // Prints every term the analyzer produces for the reader's text, each wrapped in [ ].
    public static void displayTokens(Analyzer analyzer, Reader reader) {
        // try-with-resources closes the stream once all tokens are consumed
        try (TokenStream tokenStream = analyzer.tokenStream("path", reader)) {
            // The attribute instances are updated in place on each incrementToken() call
            CharTermAttribute term = tokenStream.addAttribute(CharTermAttribute.class);
            // Obtained but not printed here; its value is available via getPositionIncrement()
            PositionIncrementAttribute posIncrAtt = tokenStream.addAttribute(PositionIncrementAttribute.class);
            tokenStream.reset();
            while (tokenStream.incrementToken()) {
                System.out.print("[" + term.toString() + "]");
            }
            tokenStream.end(); // signals that the end of the stream was reached
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
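The listing obtains a PositionIncrementAttribute but never prints it. As a rough illustration of what that attribute conveys, the following plain-JDK sketch models how per-token increments add up to absolute positions when stop words are dropped. It deliberately does not call Lucene: the class PositionIncrementSketch, the method tokensWithPositions, the whitespace splitting, and the 1-based numbering are all assumptions made for this example, not part of the original utility or of Lucene's API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative only: a plain-JDK model of position increments, not Lucene code.
class PositionIncrementSketch {

    // Splits text on whitespace, drops stop words, and labels each kept token
    // with its absolute position. Every dropped word widens the increment
    // carried by the next kept token, which leaves a gap in the positions.
    static List<String> tokensWithPositions(String text, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        int position = 0;   // absolute position of the last emitted token
        int increment = 1;  // distance from the last emitted token to the next one
        for (String word : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only when the input text is blank
            }
            if (stopWords.contains(word)) {
                increment++; // the removed word still occupies a position
                continue;
            }
            position += increment;
            out.add(position + ":[" + word + "]");
            increment = 1;
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "over"));
        // prints [2:[quick], 3:[brown], 4:[fox], 5:[jumps], 8:[lazy], 9:[dog]]
        System.out.println(tokensWithPositions(
                "The quick brown fox jumps over the lazy dog", stopWords));
    }
}
```

In the real utility, the same running total could be kept by adding posIncrAtt.getPositionIncrement() to a counter inside the incrementToken() loop.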