Elasticsearch Usage Notes
2016-04-21 18:32
Overview

These are some commonly used commands collected while working with Elasticsearch, loosely organized for easy reference later.

Configuration File Options
### Key properties in config/elasticsearch.yml

```yaml
# Cluster name
cluster.name: mbdt_cluster

# Node name
node.name: xxx1

# Sets bind_host and publish_host at the same time
network.host: xxx1

# Initial list of master nodes in the cluster; other newly joined nodes are discovered automatically through them
discovery.zen.ping.unicast.hosts: ["xxx1:9300","xxx2:9300","xxx3:9300"]

# Enable dynamic scripting
script.inline: on
script.indexed: on
script.file: on
```
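To confirm which settings a running node actually picked up, you can query the node itself. A minimal check, assuming the `xxx1` host from the config above:

```sh
# Show each node's effective settings (cluster.name, node.name, discovery hosts, ...)
curl 'xxx1:9200/_nodes/settings?pretty'
```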
Cluster Inspection
```sh
# Start command; -d runs it in the background, -Xms passes a JVM option directly
bin/elasticsearch

# You can install elasticsearch-head for a web UI

# List cluster nodes
curl 'xxx1:9200/_cluster/state/nodes/?pretty'
curl 'http://10.205.17.140:9200/_nodes?pretty'

# Check cluster health
curl -XGET http://10.205.17.141:9200/_cluster/health?pretty

# Shut down the whole cluster
curl -XPOST http://localhost:9200/_cluster/nodes/_shutdown

# To shut down a single node, given the node identifier BlrmMvBdSKiCeYGsiHijdg:
curl -XPOST http://localhost:9200/_cluster/nodes/BlrmMvBdSKiCeYGsiHijdg/_shutdown

# Node identifiers can be found in the logs, or via the _cluster/nodes API
curl -XGET http://localhost:9200/_cluster/nodes/

# Note the difference between PUT and POST: POST creates, while PUT creates or
# modifies the resource at the specified URI
curl -XPUT http://localhost:9200/blog/article/1 -d '{
  "title": "New version of Elasticsearch released!",
  "content": "Version 1.0 released today!",
  "tags": ["announce", "elasticsearch", "release"]
}'
```
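The PUT/POST distinction is easiest to see with document ids: a POST to the type endpoint without an id lets Elasticsearch generate one, while a PUT to an explicit id creates or replaces that exact document. A quick illustration, reusing the blog/article index from above:

```sh
# POST without an id: Elasticsearch auto-generates the document id (returned in the response)
curl -XPOST http://localhost:9200/blog/article -d '{"title": "Auto-id article"}'

# PUT to the same explicit id twice: the second call replaces the document and increments _version
curl -XPUT http://localhost:9200/blog/article/1 -d '{"title": "v1"}'
curl -XPUT http://localhost:9200/blog/article/1 -d '{"title": "v2"}'
```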
Document CRUD
```sh
# Retrieve a document
curl -XGET http://mbdt3:9200/blog/article/1?pretty

# Create an index
curl -XPUT 'mbdt1:9200/customer?pretty'

# List indices
curl 'mbdt2:9200/_cat/indices?v'

# Update a document; "doc" holds the partial document content
curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -d '
{
  "doc": { "name": "Jane Doe" }
}'

# Update with a dynamic script
curl -XPOST 'localhost:9200/customer/external/1/_update?pretty' -d '
{
  "script" : "ctx._source.age += 5"
}'
# This may fail with:
#   "scripts of type [inline], operation [update] and lang [groovy] are disabled"
# If so, enable Groovy dynamic scripting in Elasticsearch; see http://www.du52.com/text.php?id=629
```

### Enable fully

Edit the `config/elasticsearch.yml` file and append the following:

```yaml
script.inline: on
script.indexed: on
script.file: on
```

### Enable in a sandbox

Edit the `config/elasticsearch.yml` file and append the following:

```yaml
script.inline: sandbox
script.indexed: sandbox
script.file: on
```

```sh
# Delete a document
curl -XDELETE 'localhost:9200/customer/external/2?pretty'

## Bulk operations

# Create documents in bulk
curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -d '
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }
'

# Create with the "create" action
curl -XPOST 'mbdt1:9200/_bulk?pretty' -d '
{ "create": { "_index": "index1", "_type": "resource", "_id": 13 } }
{ "title": "....." }
'

# Mix several operations in one request
curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -d '
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}
'

# Bulk-import from a local file
curl -XPOST 'localhost:9200/bank/account/_bulk?pretty' --data-binary "@accounts.json"
curl 'localhost:9200/_cat/indices?v'

# Query: match all documents
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": { "match_all": {} }
}'

# Sorting, source filtering, and paging
curl -XPOST 'mbdt1:9200/bank/_search?pretty' -d '
{
  "query": { "match_all": {} },
  "_source": ["account_number", "balance"],
  "sort": {"balance": {"order": "desc"}},
  "from": 10,
  "size": 3
}'

# Field match
curl -XGET mbdt2:9200/z_wo_order/record/_search?pretty -d '{
  "query": {
    "match": {
      "cust_name": {
        "query": "黄利欢",
        "operator": "and"
      }
    }
  }
}'

# Combined (bool) query
GET /my_index/my_type/_search
{
  "query": {
    "bool": {
      "must":     { "match": { "title": "quick" }},
      "must_not": { "match": { "title": "lazy"  }},
      "should": [
        { "match": { "title": "brown" }},
        { "match": { "title": "dog"   }}
      ]
    }
  }
}

# Controlling precision
GET /my_index/my_type/_search
{
  "query": {
    "match": {
      "title": {
        "query": "quick brown dog",
        "minimum_should_match": "75%"
      }
    }
  }
}
```
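A `bool` query can also combine a scoring clause with a non-scoring filter. A sketch in the same Sense-style syntax, assuming the bank sample data loaded above (its documents carry a numeric `balance` and a text `address` field); note the `filter` clause inside `bool` requires Elasticsearch 2.0+:

```sh
GET /bank/account/_search
{
  "query": {
    "bool": {
      "must":   { "match": { "address": "mill lane" }},
      "filter": { "range": { "balance": { "gte": 20000, "lte": 30000 }}}
    }
  }
}
```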
Mapping Setup
```sh
# Test an analyzer; for Chinese word segmentation you can install the ik plugin
curl -XGET 'mbdt1:9200/index1/_analyze?pretty=true&analyzer=ik' -d "联想召回笔记本电源线"

# Create an index with explicit settings and mappings:
#   "type"     -> field type
#   "index"    -> "analyzed" means the field is tokenized
#   "analyzer" -> which analyzer a sub-field uses (ik for Chinese, english for English)
curl -XPUT mbdt2:9200/index2?pretty -d '
{
  "settings": {
    "refresh_interval": "5s",
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "_default_": {
      "_all": { "enabled": false }
    },
    "resource": {
      "dynamic": false,
      "properties": {
        "title": {
          "type": "string",
          "index": "analyzed",
          "fields": {
            "cn": {
              "type": "string",
              "analyzer": "ik"
            },
            "en": {
              "type": "string",
              "analyzer": "english"
            }
          }
        }
      }
    }
  }
}'
```
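With a multi-field mapping like this, queries target the sub-fields explicitly. A minimal check, assuming the index2/resource mapping above was created and some documents indexed:

```sh
# Inspect the mapping that was actually applied
curl 'mbdt2:9200/index2/_mapping?pretty'

# Search the ik-analyzed Chinese sub-field
curl -XPOST 'mbdt2:9200/index2/resource/_search?pretty' -d '
{
  "query": { "match": { "title.cn": "笔记本" } }
}'
```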
Data Import
Import data with the Spark module from elasticsearch-hadoop. Launch command:

```sh
./spark-shell --jars /home/ouguangneng/elasticsearch-hadoop-2.2.0/dist/elasticsearch-spark_2.10-2.2.0.jar \
  --conf spark.es.nodes=xxx2 --conf spark.es.port=9200 \
  --master yarn-client --num-executors 10 --driver-memory 4g --executor-memory 3g --executor-cores 4
```

```scala
// Import data from an RDD
import org.elasticsearch.spark._

val numbers = Map("one" -> 1, "two" -> 2, "three" -> 3)
val airports = Map("arrival" -> "Otopeni", "SFO" -> "San Fran")
sc.makeRDD(Seq(numbers, airports)).saveToEs("test/ext")

// Import data from an RDD, specifying the document id
import org.elasticsearch.spark._
import org.elasticsearch.spark.rdd.EsSpark

case class Trip(oid: String, departure: String, arrival: String)
val upcomingTrip = Trip("1", "OTP", "SFO")
val lastWeekTrip = Trip("2", "MUC", "OTP")
val rdd = sc.makeRDD(Seq(upcomingTrip, lastWeekTrip))
EsSpark.saveToEs(rdd, "test/ext", Map("es.mapping.id" -> "oid"))

// Import data from a Spark SQL DataFrame
import org.elasticsearch.spark.sql._
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)
val df = hiveContext.sql("select * from tmp.z_wo_order limit 50")
df.saveToEs("z_wo_order/record", Map("es.mapping.id" -> "order_id"))
```
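After an import it is worth confirming what actually landed in the index. A quick check with curl, assuming the z_wo_order/record target from the DataFrame example:

```sh
# Count the documents written by the import
curl 'xxx2:9200/z_wo_order/record/_count?pretty'

# Spot-check a couple of documents
curl -XPOST 'xxx2:9200/z_wo_order/record/_search?pretty' -d '{"size": 2, "query": {"match_all": {}}}'
```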