Using Avro in Hive
2015-12-04 15:10
http://www.iteblog.com/archives/1007
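To parse Avro-format data, create the Hive table with the following statement:
Next, compress the data with Snappy. The data before compression looks like this:
Suppose the compressed data is stored in the file /home/wyp/twitter.avro. Copy that file to the /user/wyp/examples/input/ directory in HDFS:
The table can now be queried in Hive:
Alternatively, the schema passed in avro.schema.literal can be saved to a file, for example twitter.avsc, and referenced through avro.schema.url. The CREATE TABLE statement above then becomes:
The result is the same as before.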
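Unless otherwise stated, all articles on this blog are original.
Please respect original work. When reposting, credit the source: 过往记忆 (http://www.iteblog.com/)
Link to this article: "Using Avro in Hive" (http://www.iteblog.com/archives/1007)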
CREATE TABLE statement with the Avro schema embedded:

    hive> CREATE EXTERNAL TABLE tweets
        > COMMENT "A table backed by Avro data with the Avro schema embedded in the CREATE TABLE statement"
        > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
        > STORED AS
        > INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
        > OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
        > LOCATION '/user/wyp/examples/input/'
        > TBLPROPERTIES (
        > 'avro.schema.literal'='{
        > "type": "record",
        > "name": "Tweet",
        > "namespace": "com.miguno.avro",
        > "fields": [
        > { "name": "username", "type": "string" },
        > { "name": "tweet", "type": "string" },
        > { "name": "timestamp", "type": "long" }
        > ]
        > }'
        > );
    OK
    Time 0.076 seconds

    hive>
    OK
    username
    tweet
    timestamp
Data before compression:

    {
      "username": "miguno",
      "tweet": "Rock: ,
      "timestamp": 1366150681
    },
    {
      "username": "BlizzardCS",
      "tweet": "Works as intended. Terran is IMBA.",
      "timestamp": 1366154481
    },
    {
      "username": "DarkTemplar",
      "tweet": "From the shadows I come!",
      "timestamp": 1366154681
    },
    {
      "username": "VoidRay",
      "tweet": "Prismatic core online!",
      "timestamp": 1366160000
    }
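The post does not show how the JSON records become a Snappy-compressed Avro file. One possible way, sketched here, uses the `fromjson` command of Apache Avro's avro-tools jar. The jar location (`AVRO_TOOLS_JAR`), the scratch directory (`WORKDIR`), and the local file names are assumptions for illustration. Only the three records that appear complete in the listing are used, written one JSON object per line without separating commas, which is the input layout commonly used with `fromjson`.

```shell
#!/bin/sh
# Sketch: build a Snappy-compressed Avro file from JSON records with avro-tools.
set -eu

# Assumption: a scratch directory; the post keeps its files under /home/wyp.
WORKDIR="${WORKDIR:-$(mktemp -d)}"
# Assumption: path to a locally downloaded avro-tools jar (no default is implied).
AVRO_TOOLS_JAR="${AVRO_TOOLS_JAR:-}"

# Avro schema with the same fields as the table definition in the post.
cat > "$WORKDIR/twitter.avsc" <<'EOF'
{
  "type": "record",
  "name": "Tweet",
  "namespace": "com.miguno.avro",
  "fields": [
    {"name": "username", "type": "string"},
    {"name": "tweet", "type": "string"},
    {"name": "timestamp", "type": "long"}
  ]
}
EOF

# Input records: one JSON object per line, no commas between records.
cat > "$WORKDIR/twitter.json" <<'EOF'
{"username": "BlizzardCS", "tweet": "Works as intended. Terran is IMBA.", "timestamp": 1366154481}
{"username": "DarkTemplar", "tweet": "From the shadows I come!", "timestamp": 1366154681}
{"username": "VoidRay", "tweet": "Prismatic core online!", "timestamp": 1366160000}
EOF

# Convert only when the jar is actually available; otherwise just report.
if [ -n "$AVRO_TOOLS_JAR" ] && [ -f "$AVRO_TOOLS_JAR" ]; then
  java -jar "$AVRO_TOOLS_JAR" fromjson --codec snappy \
    --schema-file "$WORKDIR/twitter.avsc" "$WORKDIR/twitter.json" \
    > "$WORKDIR/twitter.avro"
  echo "wrote $WORKDIR/twitter.avro"
else
  echo "AVRO_TOOLS_JAR not set; wrote only the schema and JSON input to $WORKDIR"
fi
```

Run it as `AVRO_TOOLS_JAR=/path/to/avro-tools.jar sh make_avro.sh` (the script name is hypothetical); the resulting twitter.avro is the file that gets copied to HDFS.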
Copy the file to HDFS:

    hadoop fs -put /home/wyp/twitter.avro /user/wyp/examples/input/
Query the table in Hive:

    hive> select * from tweets limit 5;
    OK
    miguno 1366150681
    BlizzardCS Works as intended. Terran is IMBA. 1366154481
    DarkTemplar From the shadows I come! 1366154681
    VoidRay Prismatic core online! 1366160000
    Time 0.495 seconds, 4 row(s)
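The timestamp column holds Unix epoch seconds (1366150681 falls in April 2013). As a small sketch, the query below renders it as a readable date with Hive's built-in `from_unixtime` and passes it to `hive -e` for non-interactive execution. The `DRY_RUN` switch and the `run_query` helper are hypothetical conveniences, and the backticks around `timestamp` assume a Hive version that accepts backtick-quoted column names.

```shell
#!/bin/sh
# Sketch: query the tweets table non-interactively and format the epoch column.
set -eu

# Assumption: DRY_RUN=1 (the default here) only prints the query; set DRY_RUN=0 to run it.
DRY_RUN="${DRY_RUN:-1}"

# Single quotes keep the shell from interpreting the backticks.
QUERY='SELECT username, tweet, from_unixtime(`timestamp`) AS posted_at FROM tweets LIMIT 5;'

run_query() {
  if [ "$DRY_RUN" = "1" ]; then
    # Show the statement that would be handed to hive -e.
    printf '%s\n' "$QUERY"
  else
    hive -e "$QUERY"
  fi
}

run_query
```

The formatted value depends on the time zone of the machine running Hive, so the exact string may differ between clusters.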
Contents of twitter.avsc:

    {
      "type": "record",
      "name": "Tweet",
      "namespace": "com.miguno.avro",
      "fields": [
        {
          "name": "username",
          "type": "string"
        },
        {
          "name": "tweet",
          "type": "string"
        },
        {
          "name": "timestamp",
          "type": "long"
        }
      ]
    }
CREATE TABLE statement that reads the schema from a file:

    CREATE EXTERNAL TABLE tweets
    COMMENT "A table backed by Avro data with the Avro schema stored in HDFS"
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
    STORED AS
    INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
    OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
    LOCATION '/user/wyp/examples/input/'
    TBLPROPERTIES (
    'avro.schema.url' = 'hdfs:///user/wyp/examples/schema/twitter.avsc'
    );
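For avro.schema.url to resolve, twitter.avsc has to exist at the HDFS path named in the statement, a step the post does not show. A minimal sketch of that upload with the standard `hadoop fs` commands follows. The local path /home/wyp/twitter.avsc, the `DRY_RUN` switch, and the `run` and `upload_schema` helpers are assumptions for illustration, and `-mkdir -p` assumes Hadoop 2 or later.

```shell
#!/bin/sh
# Sketch: place the schema file where avro.schema.url expects it.
set -eu

# Assumption: local copy of the schema; the post only names the HDFS location.
SCHEMA_LOCAL="${SCHEMA_LOCAL:-/home/wyp/twitter.avsc}"
# HDFS directory taken from the avro.schema.url value in the CREATE TABLE statement.
SCHEMA_DIR="/user/wyp/examples/schema"
# Assumption: DRY_RUN=1 (the default here) prints commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

upload_schema() {
  # Create the target directory, upload the schema, then list it to confirm.
  run hadoop fs -mkdir -p "$SCHEMA_DIR"
  run hadoop fs -put "$SCHEMA_LOCAL" "$SCHEMA_DIR/"
  run hadoop fs -ls "$SCHEMA_DIR"
}

upload_schema
```

Keeping the schema in a file means a schema change only requires replacing twitter.avsc, without editing the table definition.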