Flink Study Notes --- Understanding DataStream WordCount
2017-07-20 10:28
The pom.xml content is as follows:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>zetyun</groupId>
  <artifactId>FlinkWordCounts</artifactId>
  <version>1.0-SNAPSHOT</version>
  <inceptionYear>2008</inceptionYear>
  <properties>
    <scala.version>2.11.0</scala.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-core -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-core</artifactId>
      <version>1.3.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-clients_2.11 -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-clients_2.11</artifactId>
      <version>1.3.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-scala_2.11 -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-scala_2.11</artifactId>
      <version>1.3.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-streaming-scala_2.11 -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-streaming-scala_2.11</artifactId>
      <version>1.3.0</version>
    </dependency>
  </dependencies>
</project>

Note: the original pom also declared flink-streaming-core at version 0.9.1-hadoop1. That artifact belongs to the pre-1.0 Flink line and conflicts with the 1.3.0 dependencies above, so it is omitted; flink-streaming-scala_2.11 already provides the streaming API.
The Scala code is as follows:
package zetyun

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time

/**
  * Created by ryan on 17-7-19.
  */
object DataStreamWordCount {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val text = env.socketTextStream("192.168.1.81", 9999)

    val counts = text
      .flatMap { _.toLowerCase.split("\\W+") filter { _.nonEmpty } } // lowercase, split on non-word chars, drop empty tokens
      .map { (_, 1) }              // pair each word into (word, 1) format
      .keyBy(0)                    // partition by the first tuple field, i.e. the word
      .timeWindow(Time.seconds(5)) // tumbling 5-second window
      .sum(1)                      // sum the counts for each key within the window

    counts.print()
    env.execute("Window Stream WordCount")
  }
}
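To see what the transformation chain computes inside one window, here is a minimal pure-Scala sketch of the same word-count logic on an in-memory collection (no Flink runtime needed; `LocalWordCount` and the sample sentences are made up for illustration, and `groupBy`/`sum` stand in for Flink's `keyBy(0)`/`sum(1)`):

```scala
object LocalWordCount {
  def main(args: Array[String]): Unit = {
    // Pretend these two lines arrived within the same 5-second window.
    val window = Seq("To be or not to be", "that is the question")

    val counts = window
      .flatMap(_.toLowerCase.split("\\W+").filter(_.nonEmpty)) // tokenize, drop empties
      .map(w => (w, 1))                                        // (word, 1) pairs
      .groupBy(_._1)                                           // analogous to keyBy(0)
      .map { case (w, pairs) => (w, pairs.map(_._2).sum) }     // analogous to sum(1)

    counts.toSeq.sortBy(_._1).foreach(println)
  }
}
```

Running this prints each distinct word with its count, e.g. ("to", 2) and ("be", 2), which is what each 5-second window of the streaming job emits for its own slice of the socket input.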