
Lesson 71: Spark SQL Window Functions Explained, with Hands-on Practice

2016-05-23
Abstract: Spark study notes

Introduction from the Databricks blog:

https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html

Using Window Functions

Spark SQL supports three kinds of window functions: ranking functions, analytic functions, and aggregate functions. The available ranking functions and analytic functions are summarized in the table below. For aggregate functions, users can use any existing aggregate function as a window function.

Category              SQL            DataFrame API
Ranking functions     rank           rank
                      dense_rank     denseRank
                      percent_rank   percentRank
                      ntile          ntile
                      row_number     rowNumber
Analytic functions    cume_dist      cumeDist
                      first_value    firstValue
                      last_value     lastValue
                      lag            lag
                      lead           lead
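As the quoted paragraph notes, any existing aggregate function can also be used as a window function. A minimal sketch (assuming the Hive table scores(name STRING, score INT) that the lesson's code creates below): avg over a window keeps every row while attaching each group's average next to it, instead of collapsing the group the way a GROUP BY would.

import org.apache.spark.sql.hive.HiveContext

// Sketch: the aggregate function avg used as a window function.
// `hiveContext` and the `scores` table come from this lesson's code.
def showGroupAverages(hiveContext: HiveContext): Unit = {
  val withAvg = hiveContext.sql(
    "select name, score, " +
      "avg(score) over (partition by name) as avg_score " + // group average per name, one per row
      "from scores")
  withAvg.show()
}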

The relevant code is as follows:

import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.{SparkConf, SparkContext}

/**
 * Created by Bindy on 16-5-13.
 *
 * @author DT大数据梦工厂 student
 *
 */
object s71_SparkSQLWindowFunctionOPS {

  def main(args: Array[String]) {
    val conf = new SparkConf()
      .setAppName("SparkSQLWindowFunction")
      .setMaster("spark://cloud001:7077")
    val sc = new SparkContext(conf)
    sc.setLogLevel("WARN")

    val hiveContext = new HiveContext(sc)

    /**
     * Drop the target table if it already exists, then create the
     * table into which the data will be loaded.
     */
    hiveContext.sql("use default")
    hiveContext.sql("DROP TABLE IF EXISTS scores") // drop any table with the same name
    hiveContext.sql("CREATE TABLE IF NOT EXISTS scores(name STRING, score INT) " +
      "ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' LINES TERMINATED BY '\\n'") // create our custom table
    hiveContext.sql("LOAD DATA LOCAL INPATH '/home/hduser/IMF_Study/testData/topNGroup.txt' " +
      "INTO TABLE scores")

    /**
     * Use a subquery to extract the target data; inside it, the window
     * function row_number ranks the rows within each group:
     * partition by: the key that defines the window function's groups;
     * order by: sorts the rows within each group.
     */
    val result = hiveContext.sql("select name,score " +
      "from (" +
      "select " +
      "name," +
      "score," +
      "row_number() over (partition by name order by score desc) rank " +
      "from scores" +
      ") sub_scores " +
      "where rank <= 4")
    result.show()
    // Save the result into the Hive database
    hiveContext.sql("DROP TABLE IF EXISTS sortResultScores") // drop any table with the same name
    result.write.saveAsTable("sortResultScores")
  }
}
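The same top-4-per-group query can also be written with the DataFrame API names from the table above instead of a SQL string. A minimal sketch, assuming Spark 1.x as used in this lesson, where the function is spelled rowNumber (later Spark versions rename it to row_number):

import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, rowNumber}
import org.apache.spark.sql.hive.HiveContext

// Sketch: DataFrame API equivalent of the SQL query above.
// `hiveContext` and the `scores` table come from this lesson's code.
def topFourPerName(hiveContext: HiveContext): Unit = {
  // The window spec: partition by name, highest score first
  val w = Window.partitionBy("name").orderBy(col("score").desc)

  // rowNumber().over(w) plays the role of row_number() over (...) in SQL
  val topN = hiveContext.table("scores")
    .withColumn("rank", rowNumber().over(w))
    .where("rank <= 4")
    .select("name", "score")
  topN.show()
}

The window spec is built once with Window.partitionBy(...).orderBy(...) and can be reused across several window expressions, which is one advantage of the DataFrame API form over embedding the OVER clause in a SQL string.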