Basic Hive Commands
2016-11-15 12:24
This post briefly introduces some common HQL statements used in the Hive CLI, such as creating tables, dropping tables, and loading data. Before running any of these statements you need to start the Hive CLI, which of course requires that the Hadoop/Hive components are already installed in your environment.
1 Create a simple table –CREATE TABLE
2 List tables –SHOW TABLES {regexp(tablename)}
3 List databases –SHOW DATABASES
4 View table structure –DESCRIBE/DESC tablename
5 Alter a table –ALTER TABLE
6 Drop a table –DROP TABLE
7 Load data –LOAD DATA {LOCAL} INPATH … {OVERWRITE} INTO TABLE tablename
(Note: LOCAL loads from the local filesystem; omit it to load from HDFS. OVERWRITE replaces the table's existing data; omit it to append.)
8 Query data (compiled to MapReduce) –SELECT
9 Create a table with a custom field delimiter –FIELDS TERMINATED BY
10 Create a partitioned table –PARTITIONED BY
11 Add a partition –ADD PARTITION
12 Load data into a specific partition –LOAD DATA … PARTITION …
13 Create an external table –CREATE EXTERNAL TABLE
14 View a table's DDL –SHOW CREATE TABLE
First, start the Hive CLI as shown below (again, this assumes the Hadoop/Hive components are already installed in your environment):
[centos@cent-2 ~]$ hive
16/11/15 11:32:11 WARN conf.HiveConf: HiveConf of name hive.optimize.mapjoin.mapreduce does not exist
16/11/15 11:32:11 WARN conf.HiveConf: HiveConf of name hive.heapsize does not exist
16/11/15 11:32:11 WARN conf.HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist
16/11/15 11:32:11 WARN conf.HiveConf: HiveConf of name hive.auto.convert.sortmerge.join.noconditionaltask does not exist
Logging initialized using configuration in file:/etc/hive/conf/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hive/lib/hive-jdbc-0.14.0.2.2.0.0-2041-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
hive>
1 Create a simple table –CREATE TABLE
hive> create table test1(columna int, columnb string);
OK
Time taken: 0.349 seconds
2 List tables –SHOW TABLES {regexp(tablename)}
hive> show tables;
OK
test1
ttest1
Time taken: 0.023 seconds, Fetched: 2 row(s)
hive> show tables 'tt*';
OK
ttest1
Time taken: 0.026 seconds, Fetched: 1 row(s)
3 List databases –SHOW DATABASES
hive> show databases;
OK
default
Time taken: 0.015 seconds, Fetched: 1 row(s)
4 View table structure –DESCRIBE/DESC tablename
hive> describe test1;
OK
columna                 int
columnb                 string
Time taken: 1.497 seconds, Fetched: 2 row(s)
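Beyond the plain column list, DESCRIBE accepts modifiers that print the table's full metadata. A sketch (run against the test1 table created above):

```sql
-- DESCRIBE FORMATTED prints additional metadata in a readable layout:
-- table type, owner, HDFS location, SerDe, and table statistics.
DESCRIBE FORMATTED test1;

-- DESCRIBE EXTENDED prints the same information as a single
-- Thrift-style struct, which is more compact but harder to read.
DESCRIBE EXTENDED test1;
```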
5 Alter a table –ALTER TABLE
hive> alter table test1 rename to test2;
OK
Time taken: 0.177 seconds
hive> show tables;
OK
test2
Time taken: 0.045 seconds, Fetched: 1 row(s)
hive> alter table test2 add columns(columnc string);
OK
Time taken: 0.129 seconds
hive> desc test2;
OK
columna                 int
columnb                 string
columnc                 string
Time taken: 0.097 seconds, Fetched: 3 row(s)
6 Drop a table –DROP TABLE
hive> drop table test2;
OK
Time taken: 0.351 seconds
hive> show tables;
OK
Time taken: 0.023 seconds
7 Load data –LOAD DATA {LOCAL} INPATH … {OVERWRITE} INTO TABLE tablename
(Note: LOCAL loads from the local filesystem; omit it to load from HDFS. OVERWRITE replaces the table's existing data; omit it to append.)
[hdfs@cent-2 ~]$ hadoop fs -cat /user/hive/test.txt
1,'AAA'
2,'BBB'
3,'CCC'
hive> load data inpath '/user/hive/test.txt' overwrite into table test1;
Loading data to table default.test1
Table default.test1 stats: [numFiles=1, numRows=0, totalSize=24, rawDataSize=0]
OK
Time taken: 1.74 seconds
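The LOCAL and OVERWRITE keywords combine as follows (a sketch; the paths are the illustrative ones from this post):

```sql
-- From HDFS, replacing existing data. Note that an HDFS load *moves*
-- the file into the table's warehouse directory, so it disappears
-- from its source path.
LOAD DATA INPATH '/user/hive/test.txt' OVERWRITE INTO TABLE test1;

-- From HDFS, appending to existing data.
LOAD DATA INPATH '/user/hive/test.txt' INTO TABLE test1;

-- From the local filesystem (the file is copied, not moved).
LOAD DATA LOCAL INPATH '/home/hdfs/test.txt' INTO TABLE test1;
```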
8 Query data (compiled to MapReduce) –SELECT
hive> select count(*) from test1;
Query ID = hdfs_20161115120101_3de3af75-ca9c-4cec-892b-559f5af6f313
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1479180357223_0001, Tracking URL = http://cent-2.novalocal:8088/proxy/application_1479180357223_0001/
Kill Command = /usr/hdp/2.2.0.0-2041/hadoop/bin/hadoop job -kill job_1479180357223_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2016-11-15 12:01:35,574 Stage-1 map = 0%, reduce = 0%
2016-11-15 12:01:44,187 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.08 sec
2016-11-15 12:01:50,553 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3.77 sec
MapReduce Total cumulative CPU time: 3 seconds 770 msec
Ended Job = job_1479180357223_0001
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1  Cumulative CPU: 3.77 sec  HDFS Read: 241 HDFS Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 770 msec
OK
3
Time taken: 34.68 seconds, Fetched: 1 row(s)
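Not every SELECT launches a job: an aggregate like COUNT(*) is compiled into MapReduce as shown above, but a plain projection can often be served by a simple fetch task (this depends on the hive.fetch.task.conversion setting). A sketch against the same table:

```sql
-- Typically answered by a local fetch task, with no MapReduce job
-- and therefore none of the job-tracking output shown above.
SELECT * FROM test1 LIMIT 10;

-- EXPLAIN shows the execution plan Hive would use, including any
-- MapReduce stages, without actually running the query.
EXPLAIN SELECT COUNT(*) FROM test1;
```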
9 Create a table with a custom field delimiter –FIELDS TERMINATED BY
hive> create table test2(columna int, columnb string)
    > row format delimited fields terminated by ',';
OK
Time taken: 0.115 seconds
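The delimiter matters for the earlier load example: test1 was created without a ROW FORMAT clause, so it uses Hive's default field delimiter \001 (Ctrl-A), and each comma-separated line of test.txt lands entirely in columna (where it fails the int conversion and reads back as NULL), even though count(*) still returns 3. A sketch, assuming the same test.txt, showing that the comma-delimited test2 parses it into two columns:

```sql
-- test2 declares ',' as its field delimiter, so '1,'AAA'' splits
-- into columna=1 and columnb=''AAA'' as intended.
LOAD DATA INPATH '/user/hive/test.txt' INTO TABLE test2;
SELECT * FROM test2;
```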
10 Create a partitioned table –PARTITIONED BY
hive> create table test3(columna int, columnb string)
    > partitioned by (dt string);
OK
Time taken: 0.216 seconds
hive> describe test3;
OK
columna                 int
columnb                 string
dt                      string

# Partition Information
# col_name              data_type               comment

dt                      string
Time taken: 0.097 seconds, Fetched: 8 row(s)
11 Add a partition –ADD PARTITION
hive> alter table test4 add partition(dt='201601');
OK
Time taken: 0.194 seconds
hive> alter table test4 add partition(dt='201602');
OK
Time taken: 0.09 seconds
hive> alter table test4 add partition(dt='201603');
OK
Time taken: 0.084 seconds
hive> show partitions test4;
OK
dt=201601
dt=201602
dt=201603
Time taken: 0.096 seconds, Fetched: 3 row(s)
12 Load data into a specific partition –LOAD DATA … PARTITION …
hive> load data local inpath '/home/hdfs/test.txt' into table test4 partition(dt='201603');
Loading data to table default.test4 partition (dt=201603)
Partition default.test4{dt=201603} stats: [numFiles=1, totalSize=40]
OK
Time taken: 0.879 seconds
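Each partition is stored as its own subdirectory under the table's location (e.g. .../test4/dt=201603/). The partition column then behaves like an ordinary column in queries, and filtering on it lets Hive read only the matching directories (partition pruning). A sketch:

```sql
-- Only the dt=201603 directory is scanned; the other partitions'
-- files are never read.
SELECT * FROM test4 WHERE dt = '201603';

-- The partition column can also be selected like any other column.
SELECT dt, COUNT(*) FROM test4 GROUP BY dt;
```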
13 Create an external table –CREATE EXTERNAL TABLE
hive> create external table ext_table(columna int, columnb string, columnc string)
    > row format delimited
    > fields terminated by ','
    > location '/user/hive';
OK
Time taken: 0.103 seconds
hive> select * from ext_table;
OK
1       HHH     201601
2       JJJ     201602
3       KKK     201603
NULL    NULL    NULL
Time taken: 0.055 seconds, Fetched: 4 row(s)
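An external table only layers a schema over files that already exist at the given location; Hive does not take ownership of them. (The NULL row in the output above suggests the /user/hive directory also contains a file whose lines do not parse into the three declared columns.) The key difference from a managed table shows up on DROP, sketched here:

```sql
-- For an EXTERNAL table, DROP removes only the metastore entry;
-- the files under '/user/hive' are left untouched. Dropping a
-- managed (non-external) table would delete its data directory too.
DROP TABLE ext_table;

-- The data can be re-exposed simply by recreating the table
-- definition over the same location.
```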
14 View a table's DDL –SHOW CREATE TABLE
hive> show create table default.eboxdata;
OK
CREATE EXTERNAL TABLE `default.eboxdata`(
  `ctime` string,
  `mac` string,
  `addr` int,
  `title` string,
  `o_c` smallint,
  `enable_net_ctrl` smallint,
  `alarm` int,
  `model` string,
  `specification` string,
  `version` string,
  `a_a` float,
  `a_ld` float,
  `a_t` float,
  `a_v` float,
  `a_w` float,
  `power` float,
  `mxdw` float,
  `mxgg` float,
  `mxgl` float,
  `mxgw` float,
  `mxgy` float,
  `mxld` float,
  `mxqy` float,
  `control` smallint,
  `visibility` smallint)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '|'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://n11.trafodion.local:8020/bulkload/EBOXDATA'
TBLPROPERTIES (
  'transient_lastDdlTime'='1479781899')
Time taken: 0.158 seconds, Fetched: 36 row(s)