
Installing Hive with MySQL as the Metastore Database

2016-08-28 14:09

Downloading Hive

Hive official site: http://hive.apache.org/

Hive official downloads: http://hive.apache.org/downloads.html

Apache archive: Apache Software Foundation Distribution Directory

Version used here: apache-hive-0.13.1-bin.tar.gz

Extracting Hive

$ tar zxvf apache-hive-0.13.1-bin.tar.gz -C /opt/modules/
$ cd /opt/modules/
$ mv apache-hive-0.13.1-bin/ hive-0.13.1
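
The extract-and-rename steps above hard-code version 0.13.1. As a minimal sketch, assuming the tarball follows the apache-hive-<version>-bin.tar.gz naming used here and unpacks into a directory of the same base name, the same steps can be wrapped in two helper functions. The names hive_dir_name and install_hive_tarball are hypothetical and not part of Hive.

```shell
#!/bin/sh
# Hypothetical helper: map a Hive binary tarball name to the shorter
# directory name used in this article (apache-hive-X-bin.tar.gz -> hive-X).
hive_dir_name() {
  base=${1##*/}                 # drop any leading path
  base=${base%.tar.gz}          # drop the archive suffix
  base=${base#apache-}          # drop the "apache-" prefix
  printf '%s\n' "${base%-bin}"  # drop the "-bin" suffix
}

# Hypothetical helper: extract the tarball into a target directory and
# rename the result. Assumes the archive's top-level directory has the
# same name as the tarball without ".tar.gz".
install_hive_tarball() {
  tarball=$1
  dest=$2
  name=$(hive_dir_name "$tarball")
  [ -f "$tarball" ] || { echo "missing tarball: $tarball" >&2; return 1; }
  [ ! -e "$dest/$name" ] || { echo "already exists: $dest/$name" >&2; return 1; }
  tar zxf "$tarball" -C "$dest" &&
    mv "$dest/$(basename "$tarball" .tar.gz)" "$dest/$name"
}
```

Under these assumptions, `install_hive_tarball apache-hive-0.13.1-bin.tar.gz /opt/modules` should leave the installation in /opt/modules/hive-0.13.1, and it refuses to overwrite an existing directory.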

Configuring Hive

$ cd /opt/modules/hive-0.13.1/conf
$ cp hive-env.sh.template hive-env.sh

Edit hive-env.sh and change the following two lines:

$ vim hive-env.sh
# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/opt/modules/hadoop-2.5.0
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/opt/modules/hive-0.13.1/conf
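
For a scripted setup, the same two settings can be applied without opening an editor. The following is only a sketch: it appends the lines instead of editing the commented-out template entries, which should be equivalent because hive-env.sh is sourced as a shell script. The function name write_hive_env is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: create hive-env.sh from its template if it does not
# exist yet, then append the two settings from this article. Running it
# twice does not duplicate the lines.
write_hive_env() {
  conf_dir=$1      # e.g. /opt/modules/hive-0.13.1/conf
  hadoop_home=$2   # e.g. /opt/modules/hadoop-2.5.0
  env_file=$conf_dir/hive-env.sh
  if [ ! -f "$env_file" ]; then
    cp "$conf_dir/hive-env.sh.template" "$env_file" || return 1
  fi
  # -x matches whole lines, -F treats the pattern as a fixed string
  if ! grep -qxF "HADOOP_HOME=$hadoop_home" "$env_file"; then
    printf 'HADOOP_HOME=%s\n' "$hadoop_home" >> "$env_file"
  fi
  if ! grep -qxF "export HIVE_CONF_DIR=$conf_dir" "$env_file"; then
    printf 'export HIVE_CONF_DIR=%s\n' "$conf_dir" >> "$env_file"
  fi
}
```

With the paths from this article the call would be `write_hive_env /opt/modules/hive-0.13.1/conf /opt/modules/hadoop-2.5.0`.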

Verifying Hive

Before running Hive, start Hadoop, then create the /tmp and /user/hive/warehouse directories on HDFS and grant group write permission on both, as shown below:

$ cd /opt/modules/hadoop-2.5.0/
$ bin/hdfs dfs -mkdir /tmp
$ bin/hdfs dfs -mkdir -p /user/hive/warehouse
$ bin/hdfs dfs -chmod g+w /tmp
$ bin/hdfs dfs -chmod g+w /user/hive/warehouse
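
The plain `-mkdir /tmp` above fails if /tmp already exists on HDFS. A minimal sketch of a repeatable variant follows, assuming it is run from the Hadoop install directory so that bin/hdfs resolves. The HDFS_CMD variable and the ensure_hdfs_dir function are hypothetical names introduced here.

```shell
#!/bin/sh
# Path to the hdfs launcher. The default matches running from the Hadoop
# install directory as in this article; override HDFS_CMD otherwise.
HDFS_CMD=${HDFS_CMD:-bin/hdfs}

# Hypothetical helper: create an HDFS directory only if it is missing,
# then grant group write permission on it.
ensure_hdfs_dir() {
  dir=$1
  # "dfs -test -d" exits 0 when the path exists and is a directory
  if ! "$HDFS_CMD" dfs -test -d "$dir"; then
    "$HDFS_CMD" dfs -mkdir -p "$dir" || return 1
  fi
  "$HDFS_CMD" dfs -chmod g+w "$dir"
}
```

From /opt/modules/hadoop-2.5.0 this would be called as `ensure_hdfs_dir /tmp` followed by `ensure_hdfs_dir /user/hive/warehouse`.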

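Hive is now installed in embedded mode, with metadata kept in the built-in Derby database. Verify the installation with the following commands: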

$ cd /opt/modules/hive-0.13.1/
$ bin/hive

Output like the following indicates that the embedded-mode installation succeeded.

Logging initialized using configuration in jar:file:/opt/modules/hive-0.13.1/lib/hive-common-0.13.1.jar!/hive-log4j.properties
hive> show databases;
OK
default
Time taken: 0.576 seconds, Fetched: 1 row(s)

Storing Metadata in MySQL

Download the MySQL repository package

$ wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm

Install the mysql-community-release-el7-5.noarch.rpm package

$ sudo rpm -ivh mysql-community-release-el7-5.noarch.rpm


Install MySQL

$ sudo yum install -y mysql-server


Start MySQL

$ sudo service mysqld start


Enable MySQL at boot

$ sudo chkconfig mysqld on


Set the MySQL root password

$ mysqladmin -u root password 'hive'


Log in to MySQL

$ mysql -uroot -p


Configure remote login

mysql> grant all privileges on *.* to 'root'@'%' identified by 'hive' with grant option;


Delete the original user record

mysql> use mysql
mysql> delete from user where host='localhost' and user='root';


In the end only the following root record remains

mysql> select host, user, password from user;
+------+------+-------------------------------------------+
| host | user | password                                  |
+------+------+-------------------------------------------+
| %    | root | *4DF1D66463C18D44E3B001A8FB1BBFBEA13E27FC |
+------+------+-------------------------------------------+


Restart the MySQL service

mysql> quit;
$ sudo service mysqld restart
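
The steps above give Hive the MySQL root account with remote access from any host. A more restrictive alternative, which this article does not use, is a dedicated account limited to the metastore database. The sketch below only prints the SQL. It assumes the MySQL 5.x `GRANT ... IDENTIFIED BY` syntax already used above, and the function name metastore_grant_sql and every account value passed to it are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print SQL that creates a MySQL account restricted to
# one database. Values are pasted in unescaped, so they must not contain
# quotes or backticks.
metastore_grant_sql() {
  db=$1     # database name, e.g. metastore
  user=$2   # account name
  host=$3   # client host pattern, e.g. % for any host
  pass=$4   # account password
  cat <<EOF
GRANT ALL PRIVILEGES ON \`$db\`.* TO '$user'@'$host' IDENTIFIED BY '$pass';
SHOW GRANTS FOR '$user'@'$host';
EOF
}
```

The output could be piped into the client, for example `metastore_grant_sql metastore hive '%' 'some-password' | mysql -uroot -p`. The ConnectionUserName and ConnectionPassword values in hive-site.xml would then need to match that account instead of root.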


Configure Hive to store metadata in MySQL

$ cd /opt/modules/hive-0.13.1/
$ cp conf/hive-default.xml.template conf/hive-site.xml


Edit hive-site.xml

$ vim conf/hive-site.xml

<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://hadoop01.malone.com:3306/metastore?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>

<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>

<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hive</value>
<description>password to use against metastore database</description>
</property>
</configuration>
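
Since hive-site.xml only needs to hold the values that differ from Hive's defaults, the four properties above can also be generated into a minimal file instead of being edited into a full copy of the template. This is a sketch under that assumption. The function name write_hive_site is hypothetical, and the description elements are left out for brevity.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal hive-site.xml containing only the
# four JDBC metastore properties. Values are inserted without XML escaping,
# so an "&" in the URL would have to be passed as "&amp;".
write_hive_site() {
  out=$1    # target file, e.g. conf/hive-site.xml
  url=$2    # JDBC URL of the metastore database
  user=$3   # MySQL user name
  pass=$4   # MySQL password
  cat > "$out" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>$url</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>$user</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>$pass</value>
  </property>
</configuration>
EOF
}
```

With the values from this article the call would be `write_hive_site conf/hive-site.xml 'jdbc:mysql://hadoop01.malone.com:3306/metastore?createDatabaseIfNotExist=true' root hive`. Note that it overwrites the target file.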


Add the MySQL JDBC driver JAR

$ mv mysql-connector-java-5.1.27-bin.jar /opt/modules/hive-0.13.1/lib/
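
If the driver JAR is missing from lib/, Hive cannot load com.mysql.jdbc.Driver when it starts. A small pre-flight check can catch this before launching the CLI. The sketch below assumes the JAR keeps its mysql-connector-java-*.jar file name, and find_mysql_driver is a hypothetical function name.

```shell
#!/bin/sh
# Hypothetical helper: print the first MySQL connector JAR found in the
# given lib directory, or fail with a message on stderr.
find_mysql_driver() {
  lib_dir=$1
  for jar in "$lib_dir"/mysql-connector-java-*.jar; do
    # When nothing matches, the glob stays literal and -f is false
    [ -f "$jar" ] && { printf '%s\n' "$jar"; return 0; }
  done
  echo "no mysql-connector-java JAR found in $lib_dir" >&2
  return 1
}
```

For this installation the check would be `find_mysql_driver /opt/modules/hive-0.13.1/lib`.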


Testing with HQL statements

$ cd /opt/modules/hive-0.13.1/
$ bin/hive
hive> show databases;
OK
default
Time taken: 1.418 seconds, Fetched: 1 row(s)
hive> create database if not exists hive_testdb;
OK
Time taken: 1.084 seconds
hive> use hive_testdb;
OK
Time taken: 0.027 seconds
hive> show tables;
OK
Time taken: 0.029 seconds
hive> create table employee(id int, name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
OK
Time taken: 1.542 seconds
hive> load data local inpath '/opt/datas/hive/employee.txt' into table employee;
Copying data from file:/opt/datas/hive/employee.txt
Copying file: file:/opt/datas/hive/employee.txt
Loading data to table hive_testdb.employee
Table hive_testdb.employee stats: [numFiles=1, numRows=0, totalSize=52, rawDataSize=0]
OK
Time taken: 1.939 seconds
hive> desc employee;
OK
id                      int
name                    string
Time taken: 0.185 seconds, Fetched: 2 row(s)
hive> desc extended employee;
OK
id                      int
name                    string

Detailed Table Information  Table(tableName:employee, dbName:hive_testdb, owner:hadoop, createTime:1472398263, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id, type:int, comment:null), FieldSchema(name:name, type:string, comment:null)], location:hdfs://hadoop01.malone.com:8020/user/hive/warehouse/hive_testdb.db/employee, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=  , field.delim=
Time taken: 0.161 seconds, Fetched: 4 row(s)
hive> desc formatted employee;
OK
# col_name              data_type               comment

id                      int
name                    string

# Detailed Table Information
Database:               hive_testdb
Owner:                  hadoop
CreateTime:             Sun Aug 28 23:31:03 CST 2016
LastAccessTime:         UNKNOWN
Protect Mode:           None
Retention:              0
Location:               hdfs://hadoop01.malone.com:8020/user/hive/warehouse/hive_testdb.db/employee
Table Type:             MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE   true
numFiles                1
numRows                 0
rawDataSize             0
totalSize               52
transient_lastDdlTime   1472398294

# Storage Information
SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:            org.apache.hadoop.mapred.TextInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed:             No
Num Buckets:            -1
Bucket Columns:         []
Sort Columns:           []
Storage Desc Params:
field.delim             \t
serialization.format    \t
Time taken: 0.264 seconds, Fetched: 33 row(s)
hive> select * from employee;
OK
1   burce.lee
2   jacky.chen
3   elbert.malone
4   andy.lau
Time taken: 0.817 seconds, Fetched: 4 row(s)
hive> select id from employee;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1472391663133_0001, Tracking URL = http://hadoop01.malone.com:8088/proxy/application_1472391663133_0001/
Kill Command = /opt/modules/hadoop-2.5.0/bin/hadoop job  -kill job_1472391663133_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2016-08-28 23:35:16,716 Stage-1 map = 0%,  reduce = 0%
2016-08-28 23:35:50,749 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.84 sec
MapReduce Total cumulative CPU time: 1 seconds 840 msec
Ended Job = job_1472391663133_0001
MapReduce Jobs Launched:
Job 0: Map: 1   Cumulative CPU: 1.84 sec   HDFS Read: 294 HDFS Write: 8 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 840 msec
OK
1
2
3
4
Time taken: 86.453 seconds, Fetched: 4 row(s)
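
The session above loads /opt/datas/hive/employee.txt, but the file itself is not shown. The sketch below reconstructs it from the `select *` output as four tab-separated rows. The result is 52 bytes, which matches the totalSize=52 reported after the load. The function name write_employee_sample is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write the sample data as tab-separated "id<TAB>name"
# rows, matching FIELDS TERMINATED BY '\t' in the table definition.
write_employee_sample() {
  out=$1
  mkdir -p "$(dirname "$out")" || return 1
  # printf reuses the format string for each pair of arguments
  printf '%s\t%s\n' \
    1 burce.lee \
    2 jacky.chen \
    3 elbert.malone \
    4 andy.lau > "$out"
}
```

Calling `write_employee_sample /opt/datas/hive/employee.txt` before the `load data local inpath` statement should provide the input that the test expects.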


Common Hive property settings

Show the current database name and column headers in the CLI

$ cd /opt/modules/hive-0.13.1/
$ vim conf/hive-site.xml


Add the following properties

<property>
<name>hive.cli.print.header</name>
<value>true</value>
<description>Whether to print the names of the columns in query output.</description>
</property>

<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
<description>Whether to include the current database in the Hive prompt.</description>
</property>


Result after the change

$ bin/hive

Logging initialized using configuration in jar:file:/opt/modules/hive-0.13.1/lib/hive-common-0.13.1.jar!/hive-log4j.properties
hive (default)> show databases;
OK
database_name
default
hive_testdb
Time taken: 0.768 seconds, Fetched: 2 row(s)
hive (default)> use hive_testdb;
OK
Time taken: 0.028 seconds
hive (hive_testdb)> show tables;
OK
tab_name
employee
Time taken: 0.063 seconds, Fetched: 1 row(s)
hive (hive_testdb)> select * from employee;
OK
employee.id employee.name
1   burce.lee
2   jacky.chen
3   elbert.malone
4   andy.lau
Time taken: 0.917 seconds, Fetched: 4 row(s)
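
The two properties can also be set per user instead of in hive-site.xml, because the Hive CLI runs the statements in $HOME/.hiverc at startup. This is an alternative the article does not use. The sketch below appends the matching `set` statements, and write_hiverc is a hypothetical function name.

```shell
#!/bin/sh
# Hypothetical helper: add the two CLI display settings to a .hiverc file
# (default: $HOME/.hiverc) unless they are already present.
write_hiverc() {
  rc=${1:-$HOME/.hiverc}
  for line in \
    'set hive.cli.print.header=true;' \
    'set hive.cli.print.current.db=true;'
  do
    # Append only when the exact line is not in the file yet
    grep -qxF "$line" "$rc" 2>/dev/null || printf '%s\n' "$line" >> "$rc"
  done
}
```

Running `write_hiverc` with no argument targets the current user's home directory, so the settings apply only to that user's CLI sessions.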


Configuring Hive logging

$ cd /opt/modules/hive-0.13.1/conf
$ cp hive-log4j.properties.template hive-log4j.properties
$ vim hive-log4j.properties


Change the following settings

# Define some default values that can be overridden by system properties
hive.log.threshold=ALL
hive.root.logger=INFO,DRFA
hive.log.dir=/opt/modules/hive-0.13.1/logs
hive.log.file=hive.log
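
The same changes can be applied from a script instead of vim. The following is a minimal sketch of a key=value updater for properties files. The function name set_prop is hypothetical, and it assumes that values contain no `|`, `&`, or backslash characters, which holds for the settings above.

```shell
#!/bin/sh
# Hypothetical helper: set key=value in a properties file, replacing an
# existing uncommented entry or appending a new one.
set_prop() {
  file=$1
  key=$2
  value=$3
  # Escape dots so the key is matched literally in the regular expression
  key_re=$(printf '%s' "$key" | sed 's/[.]/\\./g')
  tmp=$(mktemp) || return 1
  if grep -q "^${key_re}=" "$file"; then
    sed "s|^${key_re}=.*|${key}=${value}|" "$file" > "$tmp"
  else
    { cat "$file"; printf '%s=%s\n' "$key" "$value"; } > "$tmp"
  fi
  # Write back in place so the original file permissions are kept
  cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, `set_prop hive-log4j.properties hive.log.dir /opt/modules/hive-0.13.1/logs` would apply the log directory setting shown above.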
