Elasticsearch: Checking Tokenization Results
2016-03-10 16:05
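This post lists commands for inspecting tokenization. After configuring ES, test the analysis chain to confirm that text is tokenized as expected.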
Using curl:
1. Inspect tokenization with a custom analyzer. ansj_index_synonym is the name of the custom analyzer; pretty formats the response as readable JSON.
curl -XGET 'http://localhost:8200/zh/_analyze?analyzer=ansj_index_synonym&pretty' -d '童装童鞋'
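Newer Elasticsearch releases (to my knowledge, 6.0 and later) no longer accept the query-string parameters and expect a JSON request body. A minimal sketch of the same analyzer request in that form, assuming the same host, port, index (zh) and analyzer name as above. The helper name analyze_body is made up for illustration and does no JSON escaping.

```shell
# Hypothetical helper: builds the JSON body for _analyze with a named analyzer.
# $1 = analyzer name, $2 = text to analyze.
# Assumption: neither argument contains double quotes or backslashes.
analyze_body() {
  printf '{"analyzer":"%s","text":"%s"}' "$1" "$2"
}

# Print the body only; nothing is sent to the cluster here.
analyze_body 'ansj_index_synonym' '童装童鞋'
echo

# To send it (needs a running cluster; ES 6.0+ requires the Content-Type header):
# curl -XGET 'http://localhost:8200/zh/_analyze?pretty' \
#   -H 'Content-Type: application/json' \
#   -d "$(analyze_body 'ansj_index_synonym' '童装童鞋')"
```

The response lists one entry per token, so a quick look at the token values shows whether the synonym expansion and segmentation behave as intended.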
2. Inspect tokenization with a custom tokenizer and token filters (filters):
curl -XGET 'http://localhost:8200/zh/_analyze?tokenizer=ansj_index&filters=synonym&pretty' -d '童装童鞋'
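In the JSON request-body form of the analyze API, token filters are passed as an array rather than a comma-separated parameter. A sketch of building such a body, assuming the key is "filter" (the name used since ES 5.0, as far as I know; 2.x bodies used "filters"). The helper name tokenizer_body is hypothetical and does no JSON escaping.

```shell
# Hypothetical helper: builds the JSON body for _analyze with an ad hoc chain.
# $1 = tokenizer, $2 = text, remaining arguments = token filter names (may be none).
# Assumption: no argument contains double quotes or backslashes.
tokenizer_body() {
  tokenizer=$1
  text=$2
  shift 2
  filters=''
  for f in "$@"; do
    # Append "name", with a comma first if the list is already non-empty.
    filters="${filters:+$filters,}\"$f\""
  done
  printf '{"tokenizer":"%s","filter":[%s],"text":"%s"}' "$tokenizer" "$filters" "$text"
}

# Print the body only; nothing is sent to the cluster here.
tokenizer_body 'ansj_index' '童装童鞋' 'synonym'
echo

# To send it (needs a running cluster):
# curl -XGET 'http://localhost:8200/zh/_analyze?pretty' \
#   -H 'Content-Type: application/json' \
#   -d "$(tokenizer_body 'ansj_index' '童装童鞋' 'synonym')"
```

Filters are applied in the order given, so changing the argument order is a cheap way to see how one filter affects the input of the next.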
3. Inspect how a specific field is tokenized:
curl -XGET 'http://localhost:8200/zh/_analyze?field=brand_name&pretty' -d '童装童鞋'
"brand_name" is the field name. If the field is of nested or object type, a sub-field can be addressed with a dotted path such as "brand_name.name".
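For a sub-field of a nested or object field, the dotted path goes into the "field" value. A small sketch that joins path segments and builds a JSON request body for the analyze API, assuming the JSON-body form of the request; the helper name field_body and the sub-field name are illustrative only, and no JSON escaping is done.

```shell
# Hypothetical helper: builds the JSON body for a field-based _analyze request.
# $1 = text, remaining arguments = path segments of the field.
# Assumption: no argument contains double quotes or backslashes.
field_body() {
  text=$1
  shift
  path=''
  for seg in "$@"; do
    # Join segments with dots: brand_name name -> brand_name.name
    path="${path:+$path.}$seg"
  done
  printf '{"field":"%s","text":"%s"}' "$path" "$text"
}

# Print the body only; nothing is sent to the cluster here.
field_body '童装童鞋' 'brand_name' 'name'
echo

# To send it (needs a running cluster and an index whose mapping defines the field):
# curl -XGET 'http://localhost:8200/zh/_analyze?pretty' \
#   -H 'Content-Type: application/json' \
#   -d "$(field_body '童装童鞋' 'brand_name' 'name')"
```

A field-based request has to target an index, because the analyzer is looked up from that index's mapping.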
Besides custom analyzers, ES also ships with built-in analyzers, such as:
standard
simple
whitespace
stop
keyword
pattern
language
snowball
custom
Details: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-standard-analyzer.html
The documentation is in English, so some English reading ability helps.
ES also has built-in tokenizers and token filters: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-tokenizers.html
Tokenizers:
standard
edge_ngram
keyword
letter
lowercase
ngram
whitespace
pattern
uax_url_email
path_hierarchy
Token filters:
asciifolding
length
lowercase
uppercase
nGram
edge_ngram
porter_stem
shingle
stop
word_delimiter
stemmer
stemmer_override
keyword_marker
keyword_repeat
kstem
snowball
phonetic
synonym
reverse
elision
truncate
unique
pattern_capture
pattern_replace
trim
limit
hunspell
common_grams
normalization
delimited_payload
keep_words
Third-party Chinese analysis plugins:
elasticsearch-analysis-mmseg
https://github.com/medcl/elasticsearch-analysis-mmseg
Based on http://code.google.com/p/mmseg4j/
elasticsearch-analysis-jieba
https://github.com/huaban/elasticsearch-analysis-jieba
elasticsearch-analysis-ansj
https://github.com/4onni/elasticsearch-analysis-ansj
elasticsearch-analysis-ik
https://github.com/medcl/elasticsearch-analysis-ik
elasticsearch-analysis-paoding
https://github.com/medcl/elasticsearch-analysis-paoding
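For Chinese tokenization, ik and mmseg are recommended, since both are still being updated.
The ansj and paoding plugins have not been updated for a long time and have no releases for newer ES versions.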