spark--transform operator--distinct
2017-07-18 11:15
```scala
import org.apache.spark.{SparkConf, SparkContext}

/**
  * Created by liupeng on 2017/6/16.
  */
object T_distinct {
  System.setProperty("hadoop.home.dir", "F:\\hadoop-2.6.5")

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("distinct_test").setMaster("local")
    val sc = new SparkContext(conf)

    val list = List(1, 2, 3, 1, 4, 5, 4, 7, 1)
    val rdd = sc.parallelize(list)
    // distinct removes duplicate elements from the RDD
    rdd.distinct()
      .foreach(println)

    // For key-value data, two elements are considered equal only when
    // both the key and the value match
    val list1 = List(("liupeng", 120), ("liupeng", 120), ("liusi", 120))
    val rdd1 = sc.parallelize(list1)
    rdd1.distinct()
      .foreach(println)
  }
}
```
Output:
4
1
3
7
5
2
(liusi,120)
(liupeng,120)
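Internally, Spark builds `distinct` on top of `reduceByKey`: each element is mapped to a key-value pair, duplicates collapse during the shuffle, and the keys are extracted afterwards. The sketch below imitates that `map` → `reduceByKey` → `map` pattern with plain Scala collections (using `groupBy` as a stand-in for the shuffle), so it runs without a cluster; `DistinctSketch` and `distinctLike` are hypothetical names for illustration only.

```scala
// A minimal sketch of the pattern behind RDD.distinct,
// simulated with plain Scala collections (no Spark needed).
object DistinctSketch {
  // Mimics: rdd.map(x => (x, null)).reduceByKey((x, _) => x).map(_._1)
  def distinctLike[T](xs: Seq[T]): Seq[T] =
    xs.map(x => (x, null))   // pair each element with a dummy value
      .groupBy(_._1)         // stand-in for the shuffle in reduceByKey
      .keys                  // keep one representative per key
      .toSeq

  def main(args: Array[String]): Unit = {
    println(distinctLike(List(1, 2, 3, 1, 4, 5, 4, 7, 1)).sorted)
    println(distinctLike(List(("liupeng", 120), ("liupeng", 120), ("liusi", 120))))
  }
}
```

This also explains why key-value pairs are deduplicated on the whole `(key, value)` tuple rather than on the key alone: the entire pair becomes the grouping key.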