您的位置:首页 > 其它

Flink学习笔记 --- 理解DataStream WordCount

2017-07-20 10:28 525 查看
pom,xml 内容如下:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd"> <modelVersion>4.0.0</modelVersion>
<groupId>zetyun</groupId>
<artifactId>FlinkWordCounts</artifactId>
<version>1.0-SNAPSHOT</version>
<inceptionYear>2008</inceptionYear>
<properties>
<scala.version>2.11.0</scala.version>
</properties>

<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.flink/flink-core -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-core</artifactId>
<version>1.3.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.flink/flink-clients_2.11 -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_2.11</artifactId>
<version>1.3.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.flink/flink-scala_2.11 -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-scala_2.11</artifactId>
<version>1.3.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.flink/flink-streaming-scala_2.11 -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-scala_2.11</artifactId>
<version>1.3.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.flink/flink-streaming-core -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-core</artifactId>
<version>0.9.1-hadoop1</version>
</dependency>
</dependencies>
</project>


其中的代码如下:

package zetyun

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time
/**
* Created by ryan on 17-7-19.
*/
object DataStreamWordCount {
def main(args: Array[String]) {
val env = StreamExecutionEnvironment.getExecutionEnvironment
val text = env.socketTextStream("192.168.1.81", 9999)
val counts = text.flatMap { _.toLowerCase.split("\\W+") filter { _.nonEmpty } }   // convert into lower and filter empty value
.map { (_, 1) }       // put every char in text into (char, 1) format
.keyBy(0)             // use the ( char, 1) first element hash function
.timeWindow(Time.seconds(5))  // use the window transformation
.sum(1) // sum the same key's value

counts.print

env.execute("Window Stream WordCount")
}
}
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: