
[Repost] Hive/Beeline Usage Notes

2015-07-22 09:59
FROM: http://www.7mdm.com/1407.html

Hive:

Connecting to Hive with SQuirreL SQL

Add Driver -> Name & Example URL (jdbc:hive2://xxx:10000) -> Extra Class Path -> Add

hive/lib/hive-common-*.jar
hive/lib/hive-contrib-*.jar
hive/lib/hive-jdbc-*.jar
hive/lib/libthrift-*.jar
hive/lib/hive-service-*.jar
hive/lib/httpclient-*.jar
hive/lib/httpcore-*.jar
hadoop/share/hadoop/common/hadoop-common-*.jar
hadoop/share/hadoop/common/lib/commons-configuration-*.jar
hadoop/share/hadoop/common/lib/log4j-*.jar
hadoop/share/hadoop/common/lib/slf4j-api-*.jar
hadoop/share/hadoop/common/lib/slf4j-log4j12-*.jar

-> List Drivers (wait a moment; the class name will be auto-set to org.apache.hive.jdbc.HiveDriver) -> OK -> Add Aliases -> choose the Hive driver -> done

Hive Data Migration

1. Export the table:


EXPORT TABLE <table_name> TO 'path/to/hdfs';



2. Copy the data to the other HDFS cluster:


hadoop distcp hdfs://<source-host>:8020/path/to/hdfs hdfs:///path/to/hdfs



3. Import the table:


IMPORT TABLE <table_name> FROM 'path/to/another/hdfs';



Writing Hive query results to a file

Writing to a local file:


insert overwrite local directory './test-04'
row format delimited
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ','
MAP KEYS TERMINATED BY ':'
select * from src;


Writing to HDFS:

Writing to HDFS does not seem to support row format, so a workaround is needed:


INSERT OVERWRITE DIRECTORY '/outputable.txt'
select concat(col1, ',', col2, ',', col3) from myoutputtable;


Note that the default field delimiter is \001 (Ctrl-A).

To operate on such a file directly, you can process it via stdin:


eg. hadoop fs -cat ../000000_0 | python doSomeThing.py

#!/usr/bin/env python
import sys

for line in sys.stdin:
    (a, b, c) = line.strip().split('\001')



Hive syntax:

Hive does not seem to support: select distinct col1 as col1 from table group by col1

You need to use grouping sets instead:


select col1 as col1 from table group by col1 grouping sets((col1))


Beeline:

Documentation: https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients

Connecting to Hive via JDBC:


hive2='JAVA_HOME=/opt/java7 HADOOP_HOME=/opt/hadoop /opt/hive/bin/beeline -u jdbc:hive2://n1.hd2.host.dxy:10000 -n hadoop -p fake -d org.apache.hive.jdbc.HiveDriver --color=true --silent=false --fastConnect=false --verbose=true'


If you need to run multiple statements over the Beeline JDBC connection, pass multiple -e options:


hive2 -e "xxx" -e "yyy" -e...
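
As a sketch, that multi-statement argument list can also be assembled programmatically. The default URL and user below mirror the example alias above; the helper itself is hypothetical:

```python
# Build a beeline argument list that runs several statements in one
# connection, mirroring the `hive2 -e "xxx" -e "yyy"` pattern above.
# The default JDBC URL and user are the example values from this note.

def beeline_argv(statements,
                 url="jdbc:hive2://n1.hd2.host.dxy:10000",
                 user="hadoop"):
    argv = ["beeline", "-u", url, "-n", user,
            "-d", "org.apache.hive.jdbc.HiveDriver"]
    for sql in statements:
        argv += ["-e", sql]  # one -e flag per statement
    return argv

if __name__ == "__main__":
    print(beeline_argv(["show tables;", "select 1;"]))
```

The resulting list can be handed to subprocess.call on a host where beeline is installed.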
