grouping sets,cube,rollup,grouping__id,group by
2015-09-16 13:55
Example 1:
hive -e"
select
type
,status
,count(1)
from
usr_info
where pt='2015-09-14'
group by type,status
grouping sets ((type,status),( type),());
">one.txt
GROUPING SETS aggregates once for each grouping set you list explicitly. For example, group by type,status grouping sets ((type,status),(type),())
is equivalent to group by type,status union all group by type union all group by ()
Result:
type status _c2
NULL NULL 69467
1 NULL 68216
1 1 63615
1 2 540
1 4 4061
2 NULL 891
2 1 873
2 2 18
3 NULL 360
3 1 340
3 4 20
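To make the "union all" reading concrete, here is a minimal Python sketch that emulates the query on the finest-grained counts from the result above. The function name grouping_sets and the DETAIL dictionary are illustrative only, not part of Hive; the numbers are copied from the (type, status) rows.

```python
from collections import defaultdict

# Finest-grained counts, copied from the (type, status) rows of the result above.
DETAIL = {
    (1, 1): 63615, (1, 2): 540, (1, 4): 4061,
    (2, 1): 873, (2, 2): 18,
    (3, 1): 340, (3, 4): 20,
}
COLUMNS = ("type", "status")


def grouping_sets(detail, columns, sets):
    """Emulate GROUPING SETS: run one aggregation per set and concatenate the rows."""
    rows = []
    for grouping_set in sets:
        totals = defaultdict(int)
        for key, count in detail.items():
            # Columns outside the current grouping set are reported as None (NULL).
            masked = tuple(
                value if name in grouping_set else None
                for name, value in zip(columns, key)
            )
            totals[masked] += count
        rows.extend((*masked, total) for masked, total in totals.items())
    return rows


result = grouping_sets(DETAIL, COLUMNS, [("type", "status"), ("type",), ()])
for row in result:
    print(row)
```

The sketch yields the same 11 rows as the Hive output: 7 detail rows, 3 per-type subtotals, and the grand total (None, None, 69467). Row order may differ from what Hive prints.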
Example 2:
hive -e"
select
type
,status
,count(1)
from
usr_info
where pt='2015-09-14'
group by type,status with rollup;
">two.txt
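group by type,status with rollup aggregates along a fixed hierarchy led by type. It behaves like group by type,status grouping sets ((type,status),(type),()), except that the grouping sets are fixed by the column order instead of being listed by hand; that is, group by type,status union all group by type union all group by ()
Result:
type status _c2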
NULL NULL 69467
1 NULL 68216
1 1 63615
1 2 540
1 4 4061
2 NULL 891
2 1 873
2 2 18
3 NULL 360
3 1 340
3 4 20
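The fixed pattern behind rollup is "every prefix of the column list, longest first". A small Python sketch of that rule (rollup_sets is a hypothetical helper name, not a Hive function):

```python
def rollup_sets(columns):
    """Return the grouping sets implied by WITH ROLLUP: each prefix of the column list."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]


print(rollup_sets(["type", "status"]))
# [('type', 'status'), ('type',), ()]
```

Note that (status) alone never appears, which is why the rollup result has no rows with a NULL type and a non-NULL status.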
Example 3:
hive -e"
select
type
,status
,count(1)
from
usr_info
where pt='2015-09-14'
group by type,status with cube;
">three.txt
group by type,status with cube aggregates over every combination of type and status. It behaves like group by type,status grouping sets ((type,status),(type),(status),()), except that the grouping sets are fixed; that is, group by type,status union all group by type union all group by status union all group by ()
Result:
type status _c2
NULL NULL 69467
NULL 1 64828
NULL 2 558
NULL 4 4081
1 NULL 68216
1 1 63615
1 2 540
1 4 4061
2 NULL 891
2 1 873
2 2 18
3 NULL 360
3 1 340
3 4 20
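Cube differs from rollup in that it takes every subset of the columns, not only the prefixes. A minimal Python sketch of that enumeration (cube_sets is a hypothetical helper name, not a Hive function):

```python
from itertools import combinations


def cube_sets(columns):
    """Return the grouping sets implied by WITH CUBE: every subset of the columns."""
    return [
        combo
        for size in range(len(columns), -1, -1)
        for combo in combinations(columns, size)
    ]


print(cube_sets(["type", "status"]))
# [('type', 'status'), ('type',), ('status',), ()]
```

For n columns this gives 2^n grouping sets. The extra (status) set accounts for the three additional rows in the cube result (NULL 1, NULL 2, NULL 4).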
Example 4:
hive -e"
select
type
,status
,grouping__id
,count(1)
from
usr_info
where pt='2015-09-14'
group by type,status with cube;
">five.txt
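Conceptually, grouping__id (note the two underscores) reports for each column whether it took part in the grouping for a given row: 1 if it did, 0 if it did not.
With several arguments, the per-argument results are concatenated as binary digits and returned as a decimal number:
grouping_id(argn,...,arg2,arg1)=grouping_id(argn)*2^(n-1)+...+grouping_id(arg2)*2^1+grouping_id(arg1)*2^0 ('^' denotes exponentiation).
In Hive, grouping__id takes no arguments; it is evaluated over the GROUP BY columns, as in the example above. In the result below, type contributes 2^0 and status contributes 2^1, so rows grouped by type alone get 1, rows grouped by status alone get 2, rows grouped by both get 3, and the grand total gets 0.
Result: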
type status grouping__id _c3
NULL NULL 0 69467
NULL 1 2 64828
NULL 2 2 558
NULL 4 2 4081
1 NULL 1 68216
1 1 3 63615
1 2 3 540
1 4 3 4061
2 NULL 1 891
2 1 3 873
2 2 3 18
3 NULL 1 360
3 1 3 340
3 4 3 20
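The arithmetic behind the grouping__id column can be checked with a short Python sketch. It assumes the encoding visible in the result above (first GROUP BY column is the lowest bit, and a bit is 1 when the column is part of the grouping set); grouping_id here is an illustrative helper, not the Hive built-in. As far as I know, Hive releases from 2.3.0 onward changed this encoding, so a newer Hive may print different numbers for the same query.

```python
def grouping_id(columns, grouping_set):
    """Sum 2^i for every column i that is part of the grouping set.

    Assumption: bit 0 belongs to the first GROUP BY column, matching
    the output shown in this post.
    """
    return sum(1 << i for i, name in enumerate(columns) if name in grouping_set)


COLUMNS = ("type", "status")
for grouping_set in [(), ("status",), ("type",), ("type", "status")]:
    print(grouping_set, grouping_id(COLUMNS, grouping_set))
# () 0
# ('status',) 2
# ('type',) 1
# ('type', 'status') 3
```

These four values line up with the rows above: 0 for the grand total, 2 for the per-status subtotals, 1 for the per-type subtotals, and 3 for the detail rows.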