
Understanding SolrCloud and Setting Up a SolrCloud Cluster

2015-10-30 19:11


SolrCloud

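SolrCloud is a distributed search engine that combines Solr with ZooKeeper. It supports centralized configuration, automatic fault tolerance, and automatic load balancing of queries.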

Key SolrCloud concepts

collection

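In SolrCloud, a collection can be viewed as the complete logical index over all documents. If the index becomes too bloated for a single server, or queries against it take too long, the collection can be divided into multiple shards.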

shards

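A shard is one slice of the collection's index. SolrCloud distributes documents across however many shards you configure. Each shard consists of one or more replicas, one of which is elected leader. At indexing time, if a document is submitted to a non-leader replica, that replica forwards the request to the leader of its shard, and the leader then routes the document to the shard's other replicas. If the leader goes down, SolrCloud elects a new leader from the remaining replicas.

In a sense, replicas are backups of the shard and make the cluster more robust. They resemble the master-slave setup used before Solr 4.0, except that in SolrCloud replicas also serve queries and so share the search load.

When a shard's document count reaches a threshold, the shard can be split. A client can still hand a document to any replica: the replica passes it to its leader, that leader forwards it to the leader of the newly split shard, and that leader distributes the document to its own replicas.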
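To make the routing idea concrete, here is a minimal sketch of mapping a document id to a shard. It rests on assumptions: `shard_for` is a hypothetical helper, and it swaps in a simple `cksum` hash with a modulo, whereas Solr's default router hashes the id with MurmurHash3 and assigns each shard a hash range.

```shell
# Illustrative only: map a document id to one of N shards.
# Uses cksum (CRC) modulo N, NOT Solr's MurmurHash3 hash ranges.
shard_for() {
    # $1 = document id, $2 = number of shards
    doc_hash=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    echo $(( doc_hash % $2 ))
}

# Usage: show where a few hypothetical ids would land with 3 shards
for id in doc-1 doc-2 doc-3; do
    echo "$id -> shard $(shard_for "$id" 3)"
done
```

The point of the sketch is only that the same id always lands on the same shard, so whichever replica receives a document can work out which shard leader should get it.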

Setting up SolrCloud 5.3.1

1. Download the JDK, ZooKeeper, and Solr packages

jdk-8u45-linux-x64.tar.gz

zookeeper-3.4.6.tar.gz

solr-5.3.1.tgz

2. Install the JDK and set the environment variables

[root@localhost soft]# ls
jdk-8u45-linux-x64.tar.gz  solr-5.3.1  solr-5.3.1.tgz  zookeeper-3.4.6.tar.gz
[root@localhost soft]# cd
[root@localhost soft]# tar xf jdk-8u45-linux-x64.tar.gz
[root@localhost soft]# ls
jdk1.8.0_45  jdk-8u45-linux-x64.tar.gz  solr-5.3.1  solr-5.3.1.tgz  zookeeper-3.4.6.tar.gz
[root@localhost soft]#
[root@localhost soft]# vim /etc/profile
# Add the following environment variables:
export JAVA_HOME=/home/workspace/soft/jdk1.8.0_45
export PATH=$PATH:$JAVA_HOME/bin
[root@localhost soft]# source /etc/profile             # apply the changes
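Before moving on, it can help to confirm that the JDK is actually reachable through JAVA_HOME. The following is a small sketch; `check_java_home` is a hypothetical helper, not part of the JDK.

```shell
# Succeeds if the given directory looks like a usable JDK home.
check_java_home() {
    # $1 = candidate JAVA_HOME
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

if check_java_home "${JAVA_HOME:-}"; then
    "$JAVA_HOME/bin/java" -version
else
    echo "JAVA_HOME is unset or has no bin/java: '${JAVA_HOME:-}'" >&2
fi
```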


3. Install the ZooKeeper cluster
[root@localhost soft]# tar xf zookeeper-3.4.6.tar.gz
[root@localhost soft]# mkdir -p /home/workspace/zookeeper/data
[root@localhost soft]# mv zookeeper-3.4.6 /home/workspace/zookeeper/
[root@localhost workspace]# ls
soft  solrcloud  zookeeper
[root@localhost workspace]# cd zookeeper/
[root@localhost zookeeper]# ls
data  zookeeper-3.4.6
[root@localhost zookeeper]# cd zookeeper-3.4.6/
[root@localhost zookeeper-3.4.6]# ls
bin          contrib          ivy.xml      README_packaging.txt  zookeeper-3.4.6.jar
build.xml    dist-maven       lib          README.txt            zookeeper-3.4.6.jar.asc
CHANGES.txt  docs             LICENSE.txt  recipes               zookeeper-3.4.6.jar.md5
conf         ivysettings.xml  NOTICE.txt   src                   zookeeper-3.4.6.jar.sha1
[root@localhost zookeeper-3.4.6]# cd conf
[root@localhost conf]# ls
configuration.xsl  log4j.properties  zoo.cfg  zoo_sample.cfg
[root@localhost conf]# cp zoo_sample.cfg zoo.cfg
[root@localhost conf]# vim zoo.cfg                # edit the settings below

# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
# data storage path
dataDir=/home/workspace/zookeeper/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance #
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

# add the three servers of the ensemble
server.3=192.168.219.131:2888:3888
server.2=192.168.219.130:2888:3888
server.1=192.168.219.132:2888:3888


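The configuration above must be applied on each of the three servers.

Next, on each server, go to the data directory configured as dataDir above and create a file named myid containing that server's ID.

For example, for server 3 (server.3), run the following in the data directory on that server: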
[root@localhost data]# echo "3" >myid


Write the matching myid on each of the three servers.
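Since each server's ID must agree with its `server.N` line in zoo.cfg, the ID can also be derived from the config rather than typed by hand. This is a sketch under the assumption that every `server.N` line uses the `IP:port:port` form shown above; `myid_for_ip` is a hypothetical helper.

```shell
# Print the N of the "server.N=IP:port:port" line whose IP matches.
myid_for_ip() {
    # $1 = path to zoo.cfg, $2 = this host's IP address
    awk -v ip="$2" '
        /^server\.[0-9]+=/ {
            split($0, kv, "=")        # kv[1] = "server.N", kv[2] = "IP:2888:3888"
            split(kv[2], addr, ":")   # addr[1] = IP
            if (addr[1] == ip) {
                sub(/^server\./, "", kv[1])
                print kv[1]
            }
        }' "$1"
}

# Usage on the host 192.168.219.131 (paths as used in this post):
#   myid_for_ip /home/workspace/zookeeper/zookeeper-3.4.6/conf/zoo.cfg \
#       192.168.219.131 > /home/workspace/zookeeper/data/myid
```

If the IP does not appear in zoo.cfg the function prints nothing, so it is worth checking that the resulting myid file is not empty.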

Then start ZooKeeper on each of the three servers:
[root@localhost zookeeper-3.4.6]# cd bin
[root@localhost bin]# ./zkServer.sh start


Check whether the cluster was set up successfully:
[root@localhost zookeeper-3.4.6]# bin/zkServer.sh status
JMX enabled by default
Using config: /home/workspace/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: leader
[root@localhost zookeeper-3.4.6]#


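If the status looks like the output above, the ZooKeeper cluster is working (one node reports Mode: leader and the others report Mode: follower). If Mode shows standalone instead, the node is running in the default single-server mode rather than as part of a cluster, so recheck the configuration file for mistakes. If you get a "no route to host" error, the firewall may be the cause; you can stop it with the command service iptables stop.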
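The status check can be scripted by reading the `Mode:` line from the `zkServer.sh status` output. This is a minimal sketch; `zk_mode` and `zk_in_ensemble` are hypothetical helper names.

```shell
# Print the value of the "Mode:" line from status output on stdin.
zk_mode() {
    sed -n 's/^Mode:[[:space:]]*//p'
}

# Succeed only if the node reports a clustered mode.
zk_in_ensemble() {
    case "$(zk_mode)" in
        leader|follower|observer) return 0 ;;
        *) return 1 ;;   # standalone, or no Mode line at all
    esac
}

# Usage on a ZooKeeper host, from the zookeeper-3.4.6 directory:
#   bin/zkServer.sh status 2>&1 | zk_in_ensemble && echo "clustered"
```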

4. Install the SolrCloud service

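In the workspace directory, create the two directories data and solr under solrcloud:
[root@localhost workspace]# mkdir -p /home/workspace/solrcloud/{data,solr}    # -p creates any missing parent directories
# Next, go to the soft directory, extract Solr, and install the service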
[root@localhost soft]# tar xf solr-5.3.1.tgz
[root@localhost soft]# ls
jdk1.8.0_45  jdk-8u45-linux-x64.tar.gz  solr-5.3.1  solr-5.3.1.tgz  zookeeper-3.4.6.tar.gz
[root@localhost soft]# cd solr-5.3.1/bin
[root@localhost bin]# ./install_solr_service.sh /home/workspace/soft/solr-5.3.1.tgz  -d /home/workspace/solrcloud/data/ -i /home/workspace/solrcloud/solr/ -s solrcloud -u root -p 8080
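# -d is the directory for the service's live data (solr.in.sh, logs, pid file, index data); -i is the directory the Solr archive is extracted into; -s is the service name; -u is the user that runs Solr; -p sets the port (default 8983)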

[root@localhost workspace]# cd solrcloud/
[root@localhost solrcloud]# ls
data  solr
[root@localhost solrcloud]# cd solr
[root@localhost solr]# cd ..
[root@localhost solrcloud]# ls -al
total 16
drwxr-xr-x. 4 root root 4096 Oct 30 08:09 .
drwxr-xr-x. 5 root root 4096 Oct 30 08:09 ..
drwxr-xr-x. 4 root root 4096 Oct 30 09:26 data
drwxr-xr-x. 3 root root 4096 Oct 30 08:13 solr
[root@localhost solrcloud]# cd solr
[root@localhost solr]# ls -al
total 12
drwxr-xr-x. 3 root root 4096 Oct 30 08:13 .
drwxr-xr-x. 4 root root 4096 Oct 30 08:09 ..
drwxr-xr-x. 9 root root 4096 Oct 30 08:13 solr-5.3.1
lrwxrwxrwx. 1 root root   42 Oct 30 08:13 solrcloud -> /home/workspace/solrcloud/solr//solr-5.3.1
[root@localhost solr]# ls ../data
data  log4j.properties  logs  solr-8080.pid  solr.in.sh
[root@localhost solr]# vim solr.in.sh
[root@localhost solr]# cd ..
[root@localhost solrcloud]# ls
data  solr
[root@localhost solrcloud]# cd data
[root@localhost data]# ls
data  log4j.properties  logs  solr-8080.pid  solr.in.sh
[root@localhost data]# vim solr.in.sh
# Set the ZK_HOST property
# Set the ZooKeeper connection string if using an external ZooKeeper ensemble
# e.g. host1:2181,host2:2181/chroot
# Leave empty if not using SolrCloud
#ZK_HOST=""
ZK_HOST="192.168.219.132:2181,192.168.219.130:2181,192.168.219.131:2181"
Make this change on every server,
then restart the service:
[root@localhost data]# service solrcloud restart
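The ZK_HOST value set in solr.in.sh is simply each ZooKeeper host joined with the client port (clientPort=2181 in zoo.cfg). As a small sketch, a hypothetical helper `zk_host_string` can build it so the same string is used on every server:

```shell
# Join ZooKeeper hosts into a ZK_HOST connection string.
zk_host_string() {
    # $1 = client port, remaining arguments = ZooKeeper host addresses
    port=$1
    shift
    result=""
    for host in "$@"; do
        # add a comma only when result already has an entry
        result="${result:+$result,}$host:$port"
    done
    echo "$result"
}

zk_host_string 2181 192.168.219.132 192.168.219.130 192.168.219.131
# prints 192.168.219.132:2181,192.168.219.130:2181,192.168.219.131:2181
```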


Finally, create the collection:
[root@localhost solr-5.3.1]# ./server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd upconfig -confname demo-conf -confdir server/solr/configsets/basic_configs/conf/

[root@localhost solr-5.3.1]# ./server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd linkconfig -collection demo -confname demo-conf
[root@localhost solr-5.3.1]# curl 'http://192.168.219.132:8080/solr/admin/collections?action=CREATE&name=demo&numShards=3&replicationFactor=1'

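There is a constraint here: numShards * replicationFactor <= liveSolrNodes * maxShardsPerNode. With the three nodes above, numShards=3 and replicationFactor=1 gives 3 <= 3, so the request succeeds.
maxShardsPerNode defaults to 1.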
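The constraint can be checked before sending the CREATE request. This sketch assumes the comparison is non-strict (less than or equal), which is what the example in this post needs: 3 shards with 1 replica each on 3 nodes. `can_create_collection` is a hypothetical helper, not a Solr command.

```shell
# Succeed if numShards * replicationFactor fits on the live nodes.
can_create_collection() {
    # $1 = numShards, $2 = replicationFactor,
    # $3 = number of live Solr nodes, $4 = maxShardsPerNode (default 1)
    max_per_node=${4:-1}
    [ $(( $1 * $2 )) -le $(( $3 * max_per_node )) ]
}

if can_create_collection 3 1 3; then
    echo "3 shards x 1 replica fits on 3 nodes"
fi
if ! can_create_collection 3 2 3; then
    echo "3 shards x 2 replicas needs more nodes or a higher maxShardsPerNode"
fi
```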


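With the steps above you have a basic working search setup. For more advanced search, you still need to upload documents, define tokenizers, and define the schema configuration.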

[Figure: the author's directory structure (image not available)]

Tags: zookeeper, solrcloud