
Installing Hive with MySQL as the Metastore

2016-08-28 14:09
Download Hive

Hive website: http://hive.apache.org/

Official downloads: http://hive.apache.org/downloads.html

Apache archive: Apache Software Foundation Distribution Directory

Version used in this article: apache-hive-0.13.1-bin.tar.gz

Extract Hive

$ tar zxvf apache-hive-0.13.1-bin.tar.gz -C /opt/modules/
$ cd /opt/modules/
$ mv apache-hive-0.13.1-bin/ hive-0.13.1


Configure Hive

$ cd /opt/modules/hive-0.13.1/conf
$ cp hive-env.sh.template hive-env.sh


Edit hive-env.sh and change the following two lines:

$ vim hive-env.sh
# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/opt/modules/hadoop-2.5.0
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/opt/modules/hive-0.13.1/conf


Verify Hive

Before running Hive, start Hadoop first. Then create the /tmp and /user/hive/warehouse directories on HDFS and grant group write permission on them, as shown below:

$ cd /opt/modules/hadoop-2.5.0/
$ bin/hdfs dfs -mkdir /tmp
$ bin/hdfs dfs -mkdir -p /user/hive/warehouse
$ bin/hdfs dfs -chmod g+w /tmp
$ bin/hdfs dfs -chmod g+w /user/hive/warehouse


This completes the installation of Hive in embedded (Derby) mode. Verify it with the following commands:

$ cd /opt/modules/hive-0.13.1/
$ bin/hive


Output like the following indicates that the embedded-mode installation succeeded:

Logging initialized using configuration in jar:file:/opt/modules/hive-0.13.1/lib/hive-common-0.13.1.jar!/hive-log4j.properties
hive> show databases;
OK
default
Time taken: 0.576 seconds, Fetched: 1 row(s)


Storing the Metadata in MySQL

Download the MySQL yum repository package

$ wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm

Install the mysql-community-release-el7-5.noarch.rpm package

$ sudo rpm -ivh mysql-community-release-el7-5.noarch.rpm


Install MySQL

$ sudo yum install -y mysql-server


Start MySQL

$ sudo service mysqld start


Configure MySQL to start at boot

$ sudo chkconfig mysqld on


Set the MySQL root password

$ mysqladmin -u root password 'hive'


Log in to MySQL

$ mysql -uroot -p


Configure remote login

mysql> grant all privileges on *.* to 'root'@'%' identified by 'hive' with grant option;


Remove the original user entries. The statement below deletes root@localhost; on a fresh install there may also be entries for 127.0.0.1, ::1, the hostname, and anonymous users, which can be removed the same way:

mysql> use mysql
mysql> delete from user where host='localhost' and user='root';


Afterwards, only the following root record should remain:

mysql> select host, user, password from user;
+------+------+-------------------------------------------+
| host | user | password                                  |
+------+------+-------------------------------------------+
| %    | root | *4DF1D66463C18D44E3B001A8FB1BBFBEA13E27FC |
+------+------+-------------------------------------------+


Restart the MySQL service so the direct changes to the grant tables take effect (running flush privileges; would also work):

mysql> quit;
$ sudo service mysqld restart


Configure Hive to use MySQL for storage

$ cd /opt/modules/hive-0.13.1/
$ cp conf/hive-default.xml.template conf/hive-site.xml


Edit hive-site.xml so it contains the following properties (only the four metastore-related properties are shown):

$ vim conf/hive-site.xml

<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop01.malone.com:3306/metastore?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
    <description>password to use against metastore database</description>
  </property>
</configuration>
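A hive-site.xml that is not well-formed XML will make Hive fail at startup, so it can be worth sanity-checking the file after hand-editing it. A minimal sketch (written to /tmp for illustration rather than the real conf directory, and using Python's standard-library XML parser as the checker):

```shell
# Write a minimal hive-site.xml with the four metastore properties
# (illustrative /tmp path; the real file lives in hive-0.13.1/conf).
cat > /tmp/hive-site.xml <<'EOF'
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop01.malone.com:3306/metastore?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
  </property>
</configuration>
EOF
# Fail loudly if the XML is not well-formed (e.g. a missing closing tag).
python3 -c "import xml.dom.minidom; xml.dom.minidom.parse('/tmp/hive-site.xml'); print('well-formed')"
```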


Add the MySQL JDBC driver jar (downloaded separately) to Hive's lib directory

$ mv mysql-connector-java-5.1.27-bin.jar /opt/modules/hive-0.13.1/lib/


Test with HQL statements

$ cd /opt/modules/hive-0.13.1/
$ bin/hive
hive> show databases;
OK
default
Time taken: 1.418 seconds, Fetched: 1 row(s)
hive> create database if not exists hive_testdb;
OK
Time taken: 1.084 seconds
hive> use hive_testdb;
OK
Time taken: 0.027 seconds
hive> show tables;
OK
Time taken: 0.029 seconds
hive> create table employee(id int, name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
OK
Time taken: 1.542 seconds
hive> load data local inpath '/opt/datas/hive/employee.txt' into table employee;
Copying data from file:/opt/datas/hive/employee.txt
Copying file: file:/opt/datas/hive/employee.txt
Loading data to table hive_testdb.employee
Table hive_testdb.employee stats: [numFiles=1, numRows=0, totalSize=52, rawDataSize=0]
OK
Time taken: 1.939 seconds
hive> desc employee;
OK
id int
name string
Time taken: 0.185 seconds, Fetched: 2 row(s)
hive> desc extended employee;
OK
id int
name string

Detailed Table Information Table(tableName:employee, dbName:hive_testdb, owner:hadoop, createTime:1472398263, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id, type:int, comment:null), FieldSchema(name:name, type:string, comment:null)], location:hdfs://hadoop01.malone.com:8020/user/hive/warehouse/hive_testdb.db/employee, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format= , field.delim=
Time taken: 0.161 seconds, Fetched: 4 row(s)
hive> desc formatted employee;
OK
# col_name data_type comment

id int
name string

# Detailed Table Information
Database: hive_testdb
Owner: hadoop
CreateTime: Sun Aug 28 23:31:03 CST 2016
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://hadoop01.malone.com:8020/user/hive/warehouse/hive_testdb.db/employee
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE true
numFiles 1
numRows 0
rawDataSize 0
totalSize 52
transient_lastDdlTime 1472398294

# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
field.delim \t
serialization.format \t
Time taken: 0.264 seconds, Fetched: 33 row(s)
hive> select * from employee;
OK
1 burce.lee
2 jacky.chen
3 elbert.malone
4 andy.lau
Time taken: 0.817 seconds, Fetched: 4 row(s)
hive> select id from employee;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1472391663133_0001, Tracking URL = http://hadoop01.malone.com:8088/proxy/application_1472391663133_0001/
Kill Command = /opt/modules/hadoop-2.5.0/bin/hadoop job -kill job_1472391663133_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2016-08-28 23:35:16,716 Stage-1 map = 0%, reduce = 0%
2016-08-28 23:35:50,749 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.84 sec
MapReduce Total cumulative CPU time: 1 seconds 840 msec
Ended Job = job_1472391663133_0001
MapReduce Jobs Launched:
Job 0: Map: 1 Cumulative CPU: 1.84 sec HDFS Read: 294 HDFS Write: 8 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 840 msec
OK
1
2
3
4
Time taken: 86.453 seconds, Fetched: 4 row(s)
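The load data local inpath step above assumes a tab-delimited file at /opt/datas/hive/employee.txt. A sketch for creating matching sample data (the /tmp path below is a stand-in for illustration, since /opt/datas may require root to create; the names mirror the query output shown above):

```shell
# Create a tab-delimited sample file matching the article's query output.
mkdir -p /tmp/datas/hive
printf '1\tburce.lee\n2\tjacky.chen\n3\telbert.malone\n4\tandy.lau\n' \
  > /tmp/datas/hive/employee.txt
# The file is 52 bytes, which matches totalSize=52 reported by Hive above.
wc -c < /tmp/datas/hive/employee.txt
```

The field separator must be a literal tab to match the table's FIELDS TERMINATED BY '\t' clause; space-separated data would load as NULL columns.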


Common Hive configuration properties

Show the current database name and column headers in the CLI

$ cd /opt/modules/hive-0.13.1/
$ vim conf/hive-site.xml


Add the following properties:

<property>
<name>hive.cli.print.header</name>
<value>true</value>
<description>Whether to print the names of the columns in query output.</description>
</property>

<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
<description>Whether to include the current database in the Hive prompt.</description>
</property>


The effect after the change:

$ bin/hive

Logging initialized using configuration in jar:file:/opt/modules/hive-0.13.1/lib/hive-common-0.13.1.jar!/hive-log4j.properties
hive (default)> show databases;
OK
database_name
default
hive_testdb
Time taken: 0.768 seconds, Fetched: 2 row(s)
hive (default)> use hive_testdb;
OK
Time taken: 0.028 seconds
hive (hive_testdb)> show tables;
OK
tab_name
employee
Time taken: 0.063 seconds, Fetched: 1 row(s)
hive (hive_testdb)> select * from employee;
OK
employee.id employee.name
1   burce.lee
2   jacky.chen
3   elbert.malone
4   andy.lau
Time taken: 0.917 seconds, Fetched: 4 row(s)


Configure Hive logging

$ cd /opt/modules/hive-0.13.1/conf
$ cp hive-log4j.properties.template hive-log4j.properties
$ vim hive-log4j.properties


Modify the following settings (make sure the directory given in hive.log.dir exists):

# Define some default values that can be overridden by system properties
hive.log.threshold=ALL
hive.root.logger=INFO,DRFA
hive.log.dir=/opt/modules/hive-0.13.1/logs
hive.log.file=hive.log