Spark API Explained in Plain Language: groupBy and groupByKey
2017-01-20 19:40
Reposted from: http://blog.csdn.net/guotong1988/article/details/50556871
groupBy(function)
The function returns a key, and the elements of the input RDD are grouped by that key.
val a = sc.parallelize(1 to 9, 3)
// Split the elements into two groups
a.groupBy(x => { if (x % 2 == 0) "even" else "odd" }).collect
/*
Result
Array(
(even,ArrayBuffer(2, 4, 6, 8)),
(odd,ArrayBuffer(1, 3, 5, 7, 9))
)
*/
val a = sc.parallelize(1 to 9, 3)
def myfunc(a: Int): Int = {
  a % 2 // split the elements into two groups
}
a.groupBy(myfunc).collect
/*
Result
Array(
(0,ArrayBuffer(2, 4, 6, 8)),
(1,ArrayBuffer(1, 3, 5, 7, 9))
)
*/
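The snippets above rely on the `sc` that spark-shell provides. As a rough sketch of how the same grouping could be run as a standalone program, the following assumes Spark 1.3 or later on the classpath and a `local[2]` master. The names `GroupByLocalSketch` and `parityGroups` are made up for this illustration and are not part of Spark. Each group is converted to a sorted list because the order of elements inside a group, and of the groups themselves, is not guaranteed.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical object name, used only for this illustration.
object GroupByLocalSketch {
  // Groups 1..9 by parity and returns the groups in a deterministic form.
  def parityGroups(sc: SparkContext): Map[String, List[Int]] = {
    val a = sc.parallelize(1 to 9, 3)
    a.groupBy(x => if (x % 2 == 0) "even" else "odd") // RDD[(String, Iterable[Int])]
      .mapValues(_.toList.sorted)                      // fix the order inside each group
      .collect()
      .toMap
  }

  def main(args: Array[String]): Unit = {
    // local[2] runs Spark inside this JVM with two worker threads (an assumption for the demo).
    val conf = new SparkConf().setAppName("groupBy-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      println(parityGroups(sc))
    } finally {
      sc.stop() // always release the local context
    }
  }
}
```

Returning a `Map` makes the result easy to compare, since map equality does not depend on the order in which the groups were collected.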
groupByKey( )
val a = sc.parallelize(List("dog", "tiger", "lion", "cat", "spider", "eagle"), 2)
val b = a.keyBy(_.length) // attach a key to each value; the key is the length of the string
b.groupByKey.collect
// Result
// Array((4,ArrayBuffer(lion)), (6,ArrayBuffer(spider)), (3,ArrayBuffer(dog, cat)), (5,ArrayBuffer(tiger, eagle)))
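groupByKey only works on an RDD of key-value pairs, which is why the example first calls keyBy. Grouping with groupBy(f) should give the same groups as keyBy(f) followed by groupByKey. The sketch below checks this on the same word list, assuming Spark 1.3 or later and a `local[2]` master. `GroupByKeySketch`, `viaGroupByKey`, and `viaGroupBy` are hypothetical names chosen for this illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical object name, used only for this illustration.
object GroupByKeySketch {
  private val words = List("dog", "tiger", "lion", "cat", "spider", "eagle")

  // Route 1: attach the key with keyBy, then group the pairs with groupByKey.
  def viaGroupByKey(sc: SparkContext): Map[Int, List[String]] =
    sc.parallelize(words, 2)
      .keyBy(_.length)
      .groupByKey()
      .mapValues(_.toList.sorted) // sort each group so the comparison is stable
      .collect()
      .toMap

  // Route 2: let groupBy compute the key and do the grouping in one call.
  def viaGroupBy(sc: SparkContext): Map[Int, List[String]] =
    sc.parallelize(words, 2)
      .groupBy(_.length)
      .mapValues(_.toList.sorted)
      .collect()
      .toMap

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("groupByKey-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val grouped = viaGroupByKey(sc)
      assert(grouped == viaGroupBy(sc)) // both routes yield the same groups
      // Print the groups ordered by key (word length).
      grouped.toSeq.sortBy(_._1).foreach { case (len, ws) => println(s"$len -> $ws") }
    } finally {
      sc.stop()
    }
  }
}
```

In practice the choice between the two is mostly about the data at hand: groupBy suits an RDD of plain values, while groupByKey suits data that is already in (key, value) form.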